Virtual file-sharing network

ABSTRACT

A method for enabling access to a data resource, which is held on a file server (25) on a first local area network (LAN) (21a), by a client (28) on a second LAN (21b). A proxy receiver (48) on the second LAN (21b) intercepts a request for the data resource submitted by the client (28) and transmits a message via a wide area network (WAN) (29) to a proxy transmitter (52) on the first LAN (21a), requesting the data resource. The proxy transmitter (52) retrieves a replica of the data resource from the file server (25) and conveys the replica of the data resource over the WAN (29) to the proxy receiver (48), which serves the replica of the data resource from the proxy receiver (48) to the client (28) over the second LAN (21b).

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims the benefit of U.S. Provisional Patent Applications Nos. 60/309,050, filed Aug. 1, 2001; 60/331,582, filed Nov. 20, 2001; and 60/338,593, filed Dec. 11, 2001, all of which are incorporated herein by reference.

FIELD OF THE INVENTION

[0002] The present invention relates generally to computer file systems, and specifically to computer file sharing in a distributed network environment.

BACKGROUND OF THE INVENTION

[0003] Geographically dispersed enterprises often deploy distributed computer systems in order to enable information sharing throughout the enterprise. Such distributed systems generally comprise a number of local area networks (LANs) that are connected into one or more wide area networks (WANs). Enterprises have commonly used dedicated leased lines or permanent virtual circuits, such as frame relay links, to connect their LANs and WAN end-points. While providing generally predictable bandwidth and quality of service, such interconnections are often expensive and represent fixed costs for an enterprise. More recently, with the development of the Internet, many enterprises have begun to use virtual private networks (VPNs) operating over the public Internet, at least for a portion of their data traffic. Although VPNs are typically less expensive than dedicated lines, bandwidth and latency are often unpredictable, particularly when transmitting large files over long distances.

[0004] Many LANs include one or more dedicated file servers that receive data from other processors on the LAN via the network for storage on the file servers' hard disks, and supply data from the file servers' hard disks to the other processors via the network. Data stored on file servers is often accessed using a distributed file system, the most prevalent of which are Network File System (NFS), primarily used for UNIX clients, and Common Internet File System (CIFS, formerly SMB), used for Windows® clients.

[0005] Because these network file systems were primarily designed for use with high-bandwidth LANs, file access over WANs is often slow, particularly when interconnection is over a VPN. Numerous and frequent accesses to remote file servers are often necessary for most file operations, which sometimes result in noticeably poor performance of the client application.

[0006] In an attempt to improve response time, techniques of replication and caching are often used. Replication entails maintaining multiple identical copies of data, such as files and directory structures, in distributed locations throughout the network. Clients access, either manually or automatically, the local or topologically closest replica. The principal drawback of replication is that it often requires high bandwidth to maintain replicas up-to-date and ensure a certain amount of consistency between the replicas. Additionally, strong consistency is often very difficult to guarantee as the number of replicas increases with network size and complexity.

[0007] In standard cache implementations, clients maintain files accessed from the network file system in local memory or on local disk. Subsequent accesses to the cached data are performed locally until it is determined that the cached data is no longer current, in which case a fresh copy is fetched. While caching does not necessarily require high bandwidth, access to large non-cached files (such as for each first access) is sometimes unacceptably slow, particularly if using a VPN characterized by variable bandwidth and latency. Maintaining consistency is complex and often requires numerous remote validation calls while a file is being accessed.

[0008] U.S. Pat. No. 5,611,049 to Pitts, which is incorporated herein by reference, describes a distributed caching system for accessing a named dataset stored at a server connected to a network. Some of the computers on the network function as cache sites, and the named dataset is distributed over one or more such cache sites. When a client workstation presents a request for the named dataset to a cache site, the cache site first determines whether it has the dataset cached in its buffers. If the cache does not have the dataset, it relays the request to another cache site topologically closer to the server wherein the dataset is stored. This relaying may occur more than once. Once a copy of the dataset is found, either at an intermediary cache site or on the server, the dataset is sent to the requesting client workstation, where it may be either read or written by the workstation. The cache sites maintain absolute consistency between the source dataset and its copies at all cache sites. The cache sites accumulate profiling data from the dataset requests. The cache sites use this profiling data to anticipate future requests to access datasets, and, whenever possible, prevent any delay to client workstations in accessing data by asynchronously pre-fetching the data in advance of receiving a request from a client workstation.

[0009] U.S. Pat. No. 6,085,234 to Pitts et al., which is incorporated herein by reference, describes a network-infrastructure cache that transparently provides proxy file services to a plurality of client workstations concurrently requesting access to file data stored on a server. A file-request service-module of the network-infrastructure cache receives and responds to network-file-services-protocol requests from workstations. A cache included in the network-infrastructure cache stores data that is transmitted back to the workstations. A file-request generation-module, also included in the network-infrastructure cache, transmits requests for data to the server, and receives responses from the server that include data missing from the cache.

[0010] While providing an improvement in network file system performance, caching introduces potential file inconsistencies between different cached file copies. A data file is considered to have strong consistency if the changes to the data are reconciled simultaneously to all clients of the same data file. Weak consistency allows the copies of the data file to be moderately, yet tolerably, inconsistent at various times. File systems can ensure strong consistency by employing single-copy semantics between clients of the same data file. This approach typically utilizes some form of concurrency control, such as locking, to regulate shared access to files. Because achieving single-copy semantics incurs a high overhead in a distributed file system, many file systems opt for weaker consistency guarantees in order to achieve higher performance.

[0011] Cache consistency can be achieved through either client-driven protocols, in which clients send messages to origin servers to determine the validity of cached resources, or server-driven protocols, in which servers notify clients when data changes. Protocols using client-driven consistency, such as NFS (Versions 1, 2 and 3) and HTTP 1.x, either poll the server on each access to cache data in order to ensure consistent data, thereby increasing both latency and load, or poll the server periodically, which incurs a lower overhead on both the server and client but risks supplying inconsistent data. Server-driven consistency protocols, such as Coda and AFS, described below, improve client response time by allowing clients to access data without contacting the origin server, but introduce challenges of their own, mostly with respect to server load and maintaining consistency despite network or process failures.

[0012] When client-driven protocols are used in an environment requiring strong consistency, they incur high validation traffic from clients to servers. This is undesirable in high-latency networks, as each read operation must suffer a round trip delay to validate the cached data. HTTP proxy caches have traded reduced consistency for improved access performance, a rational design choice for most Web content. Each resource is associated with an expiry timestamp, often derived by some heuristic from its modification and access times. The timestamp is used to compute the resource's freshness. A cache proxy may serve any non-expired resource without first consulting the origin HTTP server. For requests targeting expired resources, the proxy must first revalidate its cached copy with the origin site before replying to the client. It is important to note that HTTP uses heuristics that reduce the chance of inconsistencies, but no hard guarantees can be made regarding actual resource validity between validations because the server may freely modify the resource while it is cached by clients.
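By way of illustration only, the expiry heuristic described above may be sketched as follows. The class `CachedResource`, the function names, and the 10% fraction are assumptions made for the example and are not taken from any particular proxy implementation.

```python
from dataclasses import dataclass

# Assumed heuristic: a copy is considered fresh for a fixed fraction of the
# time that had elapsed since its last modification when it was fetched.
HEURISTIC_FRACTION = 0.1


@dataclass
class CachedResource:
    fetched_at: float     # time the proxy stored the copy (seconds)
    last_modified: float  # modification time reported by the origin (seconds)


def freshness_lifetime(res: CachedResource) -> float:
    """Return how long the copy may be served without revalidation."""
    return HEURISTIC_FRACTION * max(0.0, res.fetched_at - res.last_modified)


def is_fresh(res: CachedResource, now: float) -> bool:
    """True if the proxy may serve the copy without consulting the origin."""
    return (now - res.fetched_at) < freshness_lifetime(res)
```

A resource that was last modified long before it was fetched receives a long lifetime, while a recently modified resource is revalidated sooner. The origin may still change the resource within the lifetime, which is the inconsistency window noted above.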

[0013] Server-driven protocols rely on the server to notify clients of changes in the attributes or content of the resource. Each server maintains a list of clients possessing a cached copy of a resource. When a cached resource is modified by a client, the server notifies all clients possessing a cached copy, forcing them to revalidate their copies before allowing further access to cached data. The server accomplishes this notification by making a callback to each client. (A callback is a remote procedure call from a server to a client.) The guaranteed notification relieves clients of having to continuously poll the server to determine validity, resulting in lower client, server and network loads, when changes are relatively infrequent compared with the overall access. However, the use of callbacks increases the burden of managing the server state (to maintain all client callbacks) and decreases system failure resilience (as the server is required to contact possibly-failed clients). CIFS and NFS Version 4 are stateful protocols. Some hybrid server-/client-driven protocols use leases for lock management. Leases grant control of a resource to a client for a server-specified fixed amount of time, and are renewable by the client. While the lease is in effect, the server may not grant conflicting control to another client. Therefore, during a lease, a client can locally use the resource for reading or writing without repeatedly checking the status of the resource with the file server. The NFS Version 4 protocol implements leases for both locks and delegation. This feature is described by Pawlowski et al., in “The NFS Version 4 protocol,” published at the System Administration and Networking (SANE) Conference (May 22-25, 2000 MECC, Maastricht, The Netherlands), which is incorporated herein by reference. This paper is available at www.nluug.nl/events/sane2000/papers/pawlowski.pdf. Leases or token-based state management also exist in several other distributed file systems.

[0014] NFS has implemented several techniques designed to improve file access performance over a WAN. NFS clients often pre-fetch data from a file server into the client cache, by asynchronously reading ahead when NFS detects that the client is accessing a file sequentially. NFS clients also asynchronously delay writing to the file server modified data in the client's cache, in order to maintain the client's access to the cached data while the client is waiting for confirmation from the file server that the modified data has been received. Additionally, NFS uses a cache for directories of files present on the file server, and a cache for attributes of files present on the file server.

[0015] A number of other distributed file systems, less widely-used than NFS and CIFS, have been developed in an attempt to overcome the performance issues encountered when using distributed file systems over WANs. These file systems use client caching, replication of information, and optimistic assumptions (local read, local write). These file systems also typically require the installation of a custom client and a custom server implementation. They do not generally support the standard file systems, such as NFS and CIFS.

[0016] For example, the Andrew File System (AFS), which is now an IBM product, is a location-independent file system that uses a local cache to reduce the workload and increase the performance of a distributed computing environment. The system was specifically designed to provide very good scalability. AFS caches complete files from the file server into the clients, which are required to have local hard disk drives. AFS has a global name space and security architecture that allows clients to connect to many separate file servers using a WAN.

[0017] Coda is an advanced networked file system developed at Carnegie Mellon University. Coda's design is based on AFS, with added support for mobile computing and additional robustness when the system experiences network problems and server failures. Coda attempts to achieve high performance through client-side persistent caching. The system was also designed to achieve good scalability.

[0018] InterMezzo is an Open Source (GPL) project included in the Linux kernel. InterMezzo's development began at Carnegie Mellon University, and was inspired by Coda. When several clients are connected to a file server, InterMezzo decides which client is permitted to write using a mechanism called a “write lease” or “write token.” Only one client can hold a write lease or token to a file at any given time, eliminating update conflicts. In InterMezzo, all clients are immediately notified of any updates to any directories to which they are connected. As a result, exported directories on all clients are always kept synchronized so long as all clients are connected to the network. Coda and InterMezzo are described by Braam et al., in “Removing bottlenecks in distributed filesystems: Coda & InterMezzo as examples,” published in the Proceedings of Linux Expo 1999 (May 1999), which is incorporated herein by reference. This paper is available at www-2.cs.cmu.edu/afs/cs/project/coda-www/ResearchWebPages/docdir/linuxexpo99.pdf.

[0019] Ficus, developed at the University of California Los Angeles, is a replicated general filing environment for UNIX, which is intended to scale to very large networks. The system employs an optimistic “one copy availability” model in which conflicting updates to the file system's directory information are automatically reconciled, while conflicting file updates are reliably detected and reported. The system architecture is based on a stackable layers methodology. Unlike AFS, Coda, and InterMezzo, which employ client-server models, Ficus employs a peer-to-peer model. Ficus is discussed by Guy et al., in “Implementation of the Ficus replicated file system,” Proceedings of the Summer USENIX Conference (Anaheim, Calif., June 1990), 63-71, and by Page et al., in “Perspectives on optimistically replicated, peer-to-peer filing,” Software: Practice and Experience 28(2) (1998), 155-180, which are incorporated herein by reference.

SUMMARY OF THE INVENTION

[0020] It is an object of some aspects of the present invention to provide improved methods, systems and software products for file sharing over wide area networks.

[0021] In preferred embodiments of the present invention, a distributed computer system comprises two or more geographically-remote local area networks (LANs) interconnected into a wide area network (WAN). The system includes one or more file servers, which are located on respective LANs. The present invention provides a Virtual File-Sharing Network (VFN)™ to enable client computers on one LAN to efficiently access files held by file servers on other LANs.

[0022] The VFN comprises two or more VFN gateways, each of which is connected to a different LAN. The VFN gateways communicate with one another over the interconnection provided by the WAN. In order to serve a resource from a file server on a first LAN to a client on a second LAN, the VFN gateway on the first LAN fetches the resource from the file server and transmits the resource over the WAN to the VFN gateway on the second LAN, which then serves the resource to the client. (The same VFN gateways may be used to provide resources from another file server on the second LAN to clients on the first LAN.) The VFN system thus may be viewed as a “double-proxy” system, in which file system requests are intercepted by the local VFN gateways, which fulfill the requests by communicating with remote VFN gateways. This architecture enables clients and file servers to interact transparently via their standard native network file system interfaces, without the need for special VFN client or server software. A single VFN system may simultaneously support multiple native file systems and network protocols.

[0023] Remote resources are efficiently and transparently made available to clients by a combination of file replicating and caching, and on-demand retrieval. These functions are performed by a receiver component of the VFN gateway, which serves the clients that are located on the same LAN as the gateway. (A transmitter component of the VFN gateway is responsible for communicating with local file servers.) Selected resources are replicated (“pre-positioned”) prior to a client request. Policies and algorithms are used to determine which resources to pre-position and when to pre-position resources, based on characteristics of the resources and the availability of bandwidth and local storage. Preferably, the policies are set so that resources with higher ratios of expected usage to expected modifications are more likely to be pre-positioned. Look-ahead fetching is employed by analyzing real-time file usage patterns to detect sequential access patterns.
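One possible reading of the usage-to-modification policy is sketched below, purely as an illustration. The dictionary fields, the `+ 1` smoothing term, and the greedy selection under a storage budget are assumptions of the example; the text above does not specify the actual policies or algorithms.

```python
def preposition_candidates(resources, storage_budget):
    """Pick resources to pre-position, favoring high read/write ratios.

    resources: list of dicts with hypothetical keys 'name', 'size',
    'expected_reads' and 'expected_writes'.
    storage_budget: local storage available for pre-positioned copies.
    """
    def score(r):
        # Ratio of expected usage to expected modifications; the +1 is an
        # assumed smoothing term that avoids division by zero.
        return r["expected_reads"] / (r["expected_writes"] + 1)

    chosen, used = [], 0
    for r in sorted(resources, key=score, reverse=True):
        if used + r["size"] <= storage_budget:
            chosen.append(r["name"])
            used += r["size"]
    return chosen
```

A real policy would also weigh available bandwidth and the timing of transfers, which this sketch leaves out.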

[0024] The VFN receiver component retrieves and caches a requested resource on-demand if the resource has not previously been pre-positioned or cached, or if the cached version of the resource has become outdated. Advantageously, because the VFN gateway caches resources centrally for the LAN, when more than one client on the LAN requests the same resource, the resource is served locally without the need for redundant remote transfers. As a result, the VFN system exploits similarities in access patterns of multiple clients in order to reduce bandwidth consumption and quickly serve resources. Additionally, the VFN system preferably implements negative caching, whereby when a VFN gateway on another LAN responds that requested content is not found, this negative response is cached by the requesting VFN receiver for a certain amount of time, so that the same request will not be repeated unnecessarily. Negative caching generally reduces bandwidth consumption and reduces resource request response time.
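The negative caching behavior described above may be sketched, under stated assumptions, as follows. `NegativeCache`, `lookup`, and the `remote_fetch` callable (standing in for a WAN request that returns `None` when the content is not found) are hypothetical names introduced for the example.

```python
class NegativeCache:
    """Remembers 'not found' answers for a limited time (assumed TTL)."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._expiry = {}  # path -> time at which the negative entry expires

    def record_miss(self, path, now):
        self._expiry[path] = now + self.ttl

    def is_known_missing(self, path, now):
        expiry = self._expiry.get(path)
        if expiry is None:
            return False
        if now >= expiry:
            del self._expiry[path]  # entry expired; ask the remote side again
            return False
        return True


def lookup(path, cache, remote_fetch, now):
    """Return the resource, or None if it is (or was recently) not found."""
    if cache.is_known_missing(path, now):
        return None  # served from the negative cache, no WAN traffic
    data = remote_fetch(path)
    if data is None:
        cache.record_miss(path, now)
    return data
```

Repeated requests for a missing path within the time-to-live are answered locally; once the entry expires, the next request crosses the WAN again.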

[0025] Each VFN receiver maintains a virtual directory of files held by remote file servers on other LANs. All registered directory trees from the remote servers are pre-positioned in the virtual directory. The VFN receiver keeps the directory information up-to-date, irrespective of file requests by its local clients. When the VFN receiver intercepts a request for file directory information or file metadata from one of the local clients, the VFN receiver looks up the information on its local virtual directory. The VFN receiver then returns the requested information directly to the client, avoiding the delay that would otherwise be involved in requesting and receiving the information from the remote file server across the WAN.

[0026] The virtual directory preferably includes metadata, including all file attributes that might be requested by a client application, such as size, modification time, creation time, and file ownership. If necessary (as in the case of NFS, for example), the VFN system extracts this metadata from within the files stored on the origin file server, wherein the metadata is ordinarily kept. Local storage of this metadata in the virtual directory has several advantages. Many file system operations require attributes of numerous files without requiring the content of those files. The virtual directory precludes the need to transfer and store these unnecessary complete files. By use of the local virtual directory, the VFN receiver provides the client with fast response time to metadata-only operations, such as browsing the file system and property checking, as well as for performing permission and validation checks against these attributes.
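A minimal sketch of such a virtual directory, assuming an in-memory mapping from path to attributes, is given below. The names `FileAttrs` and `VirtualDirectory` and the choice of fields are illustrative only; the attribute set simply mirrors the examples named above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileAttrs:
    size: int
    mtime: float   # modification time
    ctime: float   # creation time
    owner: str


class VirtualDirectory:
    """Holds metadata only; file contents are never stored here."""

    def __init__(self):
        self._entries = {}  # full path -> FileAttrs

    def update(self, path, attrs):
        """Record attributes pushed from the remote side."""
        self._entries[path] = attrs

    def stat(self, path):
        """Answer an attribute request locally, or None if unknown."""
        return self._entries.get(path)

    def listdir(self, directory):
        """List the immediate children of a directory from local metadata."""
        prefix = directory.rstrip("/") + "/"
        names = set()
        for path in self._entries:
            if path.startswith(prefix):
                names.add(path[len(prefix):].split("/", 1)[0])
        return sorted(names)
```

Browsing and property checks are thus answered from the local table without transferring any file content over the WAN.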

[0027] Preferably, VFN gateways on different LANs are connected to one another by a transport sub-system, which is based on a novel WAN-oriented protocol. This protocol ensures reliable and efficient use of available WAN bandwidth. At the same time, communications between the VFN gateways and their local clients and file servers operate in accordance with LAN-oriented protocols, typically emulating the standard client/server protocols used by the native file system. This arrangement enables seamless integration with existing LAN protocols, while providing effective performance over the WAN. To achieve efficiency, the transport sub-system preferably uses compression and delta transfer techniques, and, when appropriate, parallel connections to multiple remote VFN transmitters, multi-source routing, and throttling. Effective use of WAN bandwidth also reduces the impact of VFN traffic on other applications using the WAN.

[0028] In some preferred embodiments of the present invention, the VFN system is configured to provide strong consistency for files and directories by using a server-driven lease-based consistency protocol between VFN gateways. An access lease provides a VFN receiver with permission to perform specified operations (including writing) during a specified length of time, independent of the VFN receiver's peer VFN transmitter. Preferably, the VFN uses a lease model that provides an effective balance between VFN receiver polling and VFN transmitter state. Consistency between the VFN receiver and clients is provided by the consistency protocols of the client's native file system. Consistency between the VFN transmitter and the origin file server is preferably provided by using a watchdog VFN file agent deployed in the origin file server. Alternatively, the VFN system may be configured for weak or intermediate consistency.

[0029] In some preferred embodiments of the present invention, the VFN system includes a VFN manager, which centrally manages all VFN gateways and administers the VFN system's policy control mechanism. Policies may be edited via a multi-user GUI console, and are translated into a tag-based markup language. Policies include various distribution-related attributes that may be assigned to any given set of files or directories, such as priorities, conditional pre-fetching properties, cache consistency attributes, and active refresh rules. Policies are periodically downloaded from the VFN manager by control agents in the VFN gateways. Additionally, the VFN manager periodically collects activity logs from the control agents, and analyzes this data to generate various activity analyses and reports.

[0030] There is therefore provided, in accordance with a preferred embodiment of the present invention, a method for enabling access to a data resource, which is held on a file server on a first local area network (LAN), by a client on a second LAN, the method including:

[0031] intercepting a request for the data resource submitted by the client, using a proxy receiver on the second LAN;

[0032] transmitting a message via a wide area network (WAN) from the proxy receiver to a proxy transmitter on the first LAN, requesting the data resource;

[0033] retrieving a replica of the data resource from the file server to the proxy transmitter;

[0034] responsive to the message, conveying the replica of the data resource over the WAN from the proxy transmitter to the proxy receiver; and

[0035] serving the replica of the data resource from the proxy receiver to the client over the second LAN.
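The sequence of steps recited above may be illustrated by the following in-process sketch. The classes `FileServer`, `ProxyTransmitter` and `ProxyReceiver`, the message dictionary, and the direct function call standing in for the WAN link are all assumptions of the example rather than elements of the disclosed system.

```python
class FileServer:
    """Stands in for the file server on the first LAN."""

    def __init__(self, files):
        self.files = dict(files)

    def read(self, path):
        return self.files.get(path)


class ProxyTransmitter:
    """Runs on the first LAN, next to the file server."""

    def __init__(self, server):
        self.server = server

    def handle(self, message):
        # Retrieve a replica from the file server responsive to the message.
        return self.server.read(message["path"])


class ProxyReceiver:
    """Runs on the second LAN, next to the client."""

    def __init__(self, wan_send):
        self.wan_send = wan_send  # callable standing in for the WAN link
        self.wan_messages = 0

    def intercept(self, path):
        # Intercept the client request and forward it across the WAN.
        self.wan_messages += 1
        replica = self.wan_send({"op": "get", "path": path})
        # Serve the replica to the client over the second LAN.
        return replica


server = FileServer({"/docs/a": b"hello"})
transmitter = ProxyTransmitter(server)
receiver = ProxyReceiver(transmitter.handle)
```

In this sketch the client only ever calls `receiver.intercept`, and the file server only ever sees `transmitter` reading from it, which reflects the double-proxy arrangement in which neither end needs special software.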

[0036] As appropriate, the data resource may include a file, a block of a file, a page of content encoded in a markup language, and/or a file system directory. Conveying the replica of the data resource may include conveying metadata relating to the data resource, conveying an access list applicable to the data resource, and/or conveying a permission applicable to the data resource.

[0037] In a preferred embodiment, retrieving the replica includes monitoring the file server using a watchdog agent to detect a change made to the data resource by a native client on the first LAN, and retrieving the replica of the data resource from the file server to the proxy transmitter again responsive to the change.

[0038] In a preferred embodiment, intercepting the request includes intercepting a lock request submitted by the client for a lock on the data resource, and transmitting the message includes transmitting a lock message via the WAN from the proxy receiver to the proxy transmitter, requesting the lock, and including:

[0039] responsive to the lock message, issuing the lock at the proxy transmitter;

[0040] conveying the lock over the WAN from the proxy transmitter to the proxy receiver; and

[0041] serving the lock from the proxy receiver to the client.

[0042] Preferably, retrieving the replica of the data resource from the file server includes checking the file server to determine whether the data resource is held by the file server, and conveying the replica of the data resource from the proxy transmitter to the proxy receiver includes conveying a negative response relating to the data resource over the WAN from the proxy transmitter to the proxy receiver when it is determined that the data resource is not held by the file server, and the method includes caching the negative response at the proxy receiver for a certain period. Preferably, transmitting the message from the proxy receiver to the proxy transmitter includes checking whether the negative response relating to the requested data resource is present and not expired, and, responsive to determining that the negative response is present and not expired, withholding transmitting the message to the proxy transmitter, and serving the negative response from the proxy receiver to the client over the second LAN.

[0043] In a preferred embodiment, intercepting the request includes intercepting a file system request submitted by the client for an operation on the data resource, and wherein transmitting the message includes transmitting the file system request and a request for a lock via the WAN from the proxy receiver to the proxy transmitter, and including:

[0044] responsive to the request for the lock, obtaining the lock from the file server at the proxy transmitter; and

[0045] conveying the lock over the WAN from the proxy transmitter to the proxy receiver.

[0046] Preferably, the method includes, if the proxy receiver intercepts no more file system requests from the client with respect to the data resource for a certain period, issuing an unlock request from the proxy receiver to the proxy transmitter with respect to the data resource.

[0047] In a preferred embodiment, intercepting the request includes intercepting the request for the data resource submitted in accordance with a first native network file system of the client, and retrieving the replica includes translating the request for the data resource from the first native network file system to a second native network file system used by the file server, and retrieving the replica of the data resource using the translated request.

[0048] Preferably, conveying the replica of the data resource over the WAN includes ascertaining an available bandwidth of the WAN, and conveying the replica using a portion of the bandwidth that is less than a total available bandwidth, responsive to a management directive downloaded to the proxy receiver over the WAN.

[0049] As appropriate, transmitting the message includes aggregating the message into a batch of messages, and transmitting the aggregated batch.

[0050] In a preferred embodiment, the proxy transmitter is one of a plurality of proxy transmitters, and conveying the replica includes assessing an efficiency of conveying the replica over the WAN to the proxy receiver from each of at least two of the proxy transmitters, and selecting at least one of the proxy transmitters to convey the replica responsive to the assessed efficiency.

[0051] In this case, conveying the replica may include conveying respective portions of the replica from the at least two of the proxy transmitters, and concatenating the portions to create the replica at the proxy receiver.
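The selection among several proxy transmitters and the concatenation of portions may be sketched as follows. Throughput estimates as the measure of efficiency, equal-sized contiguous portions, and the `fetch_range` callable are assumptions made for the example.

```python
def pick_transmitters(estimates, count):
    """Choose the `count` transmitters with the best assessed efficiency.

    estimates: dict mapping a transmitter name to an estimated throughput
    (an assumed stand-in for the assessed efficiency).
    """
    ranked = sorted(estimates, key=estimates.get, reverse=True)
    return ranked[:count]


def split_ranges(size, parts):
    """Split [0, size) into `parts` contiguous (start, end) ranges."""
    base, extra = divmod(size, parts)
    ranges, start = [], 0
    for i in range(parts):
        end = start + base + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


def fetch_from_many(size, transmitters, fetch_range):
    """Fetch one portion per transmitter and concatenate them in order.

    fetch_range(transmitter, start, end) is a hypothetical callable that
    returns bytes [start, end) of the replica from that transmitter.
    """
    ranges = split_ranges(size, len(transmitters))
    portions = [fetch_range(tx, s, e) for tx, (s, e) in zip(transmitters, ranges)]
    return b"".join(portions)
```

A fuller implementation might fetch the portions concurrently and size them in proportion to each transmitter's estimate; the sketch fetches them one after another for clarity.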

[0052] Preferably, conveying the replica includes:

[0053] checking a transmitter memory of the proxy transmitter to determine whether the replica of the data resource is present in the transmitter memory and valid; and

[0054] responsive to the message and to determining that the replica in the transmitter memory is present and valid, conveying the replica from the transmitter memory over the WAN to the proxy receiver.

[0055] In this case, retrieving the replica of the data resource from the file server preferably includes retrieving the replica of the data resource from the file server to the transmitter memory when it is determined that the replica of the data resource is not present in the transmitter memory or is not valid.

[0056] Preferably, the method includes conveying to the proxy receiver metadata regarding the data resource on the file server and, responsive to the metadata, presenting to the client a virtual directory of the file server. Preferably, conveying the metadata includes reading the metadata from files held by the file server using the proxy transmitter, and conveying the metadata from the proxy transmitter to the proxy receiver.

[0057] Preferably, transmitting the message via the WAN includes encapsulating the message in accordance with a WAN transport protocol and transmitting the encapsulated message. Preferably, the WAN transport protocol includes a Hypertext Transfer Protocol (HTTP).

[0058] Preferably, conveying the replica of the data resource over the WAN includes encapsulating the replica in accordance with a WAN transport protocol and conveying the encapsulated replica. Preferably, the WAN transport protocol includes a Hypertext Transfer Protocol (HTTP) and/or a Transmission Control Protocol (TCP).

[0059] Preferably, the request for the data resource is submitted by the client using a call to a native network file system used by the file server, and retrieving the replica of the data resource includes retrieving the replica of the data resource using the native network file system. Optionally, the native network file system is selected from a group of file systems consisting of Network File System (NFS), Common Internet File System (CIFS), and NetWare file system. Preferably, transmitting the message includes encapsulating the call to the native file system for transmission in accordance with a WAN transport protocol.

[0060] Preferably, conveying the replica of the data resource includes compressing the replica at the proxy transmitter, conveying the compressed replica over the WAN, and decompressing the compressed replica at the proxy receiver. Preferably, compressing the replica includes applying delta compression at the proxy transmitter to the replica responsive to information provided to the proxy transmitter by the proxy receiver. Most preferably, applying delta compression includes correlating the replica at the proxy transmitter with another version of the replica that is available at the proxy transmitter and at the proxy receiver, and/or correlating the replica at the proxy transmitter with one or more resource blocks of one or more other resources that are available at the proxy transmitter and at the proxy receiver.
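As a rough illustration of compressing a replica against a version that both sides already hold, the sketch below uses the preset-dictionary feature of Python's standard `zlib` module. This is a stand-in technique chosen for brevity and is not the delta compression method of the disclosure; the function names are hypothetical, and the approach only benefits from the most recent 32 KB of the shared version.

```python
import os
import zlib


def delta_compress(new_version: bytes, shared_old_version: bytes) -> bytes:
    """Transmitter side: compress using the shared version as a dictionary."""
    comp = zlib.compressobj(zdict=shared_old_version)
    return comp.compress(new_version) + comp.flush()


def delta_decompress(payload: bytes, shared_old_version: bytes) -> bytes:
    """Receiver side: rebuild the new version from the payload and the
    locally held copy of the shared version."""
    decomp = zlib.decompressobj(zdict=shared_old_version)
    return decomp.decompress(payload) + decomp.flush()


# Example: both sides already hold `old`; only a small edit differs.
old = os.urandom(2048)
new = old[:1000] + b"EDIT" + old[1000:]
payload = delta_compress(new, old)
```

Because `old` is random and therefore incompressible on its own, `payload` comes out far smaller than an ordinary `zlib.compress(new)`, which shows the benefit of correlating the replica with a version available at both ends.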

[0061] In a preferred embodiment, the method includes storing the replica of the data resource in a memory of the proxy receiver, and serving the replica of the data resource from the proxy receiver includes serving the replica of the data resource from the memory of the proxy receiver.

[0062] Preferably, the method further includes: intercepting a further request for the data resource from another client on the second LAN; checking the memory to determine whether the replica of the data resource is present in the memory and valid; and responsive to the further request and to determining that the replica is present and valid, serving the replica of the data resource from the memory of the proxy receiver to the other client over the second LAN.

[0063] Preferably, when the data resource is a file including a plurality of file blocks, conveying the replica includes analyzing a pattern of access by the client to the file blocks, and conveying replicas of a portion of the file blocks not yet requested by the client, responsive to the pattern.
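As a simple example of such pattern analysis, the Python sketch below detects a run of sequential block reads and selects the next few blocks to convey ahead of demand. Sequential detection is only one of many possible access patterns, and the `run` and `window` parameters are hypothetical tuning values.

```python
def blocks_to_prefetch(history, total_blocks, window=4, run=3):
    """Return block indices to convey ahead of demand, or [] if access looks random.

    history: block indices requested so far by the client, oldest first.
    """
    recent = history[-run:]
    sequential = len(recent) == run and all(
        later - earlier == 1 for earlier, later in zip(recent, recent[1:])
    )
    if not sequential:
        return []
    start = recent[-1] + 1
    return list(range(start, min(start + window, total_blocks)))
```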

[0064] In a preferred embodiment, the client is a first client among a plurality of clients on the second LAN, and serving the replica of the data resource from the memory includes serving the replica both to the first client and to a second client among the plurality of clients.

[0065] Preferably, serving the replica includes periodically checking at the proxy receiver whether the replica of the data resource in the memory of the proxy receiver is consistent with the data resource held by the file server, and deleting the replica from the memory upon determining that the replica is not consistent. Preferably, the method additionally includes deleting the replica from the memory responsive to a predetermined cache removal policy.

[0066] Preferably, conveying the replica of the data resource includes conveying a read lease relating to the data resource to the proxy receiver, and serving the replica of the data resource includes serving the replica so long as the read lease has not expired or been revoked by the proxy transmitter. When the proxy receiver is a first proxy receiver among a plurality of proxy receivers, the method preferably includes revoking, at the proxy transmitter, the read lease conveyed to the first proxy receiver if a second proxy receiver among the plurality of proxy receivers modifies the data resource. Preferably, conveying the read lease includes setting an expiration period of the read lease responsive to a file type of the data resource. Optionally, conveying the read lease includes locking the data resource at the file server, and the method includes unlocking the data resource at the file server upon termination of the expiration period of the read lease.
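The read-lease behavior may be illustrated by the following Python sketch, in which a cached replica is served only while its lease is neither expired nor revoked. The class names, the per-file-type expiration periods and the injectable clock are assumptions introduced for the example. Locking at the file server is not modeled.

```python
import os
import time

# Hypothetical expiration periods per file type, in seconds.
LEASE_SECONDS = {".log": 5, ".doc": 60, ".iso": 3600}
DEFAULT_LEASE_SECONDS = 30


class ReadLease:
    def __init__(self, duration_s, clock):
        self._clock = clock
        self.expires_at = clock() + duration_s
        self.revoked = False

    def revoke(self):
        """Would be triggered by the proxy transmitter, e.g. after a remote write."""
        self.revoked = True

    def is_valid(self):
        return not self.revoked and self._clock() < self.expires_at


class ReceiverCache:
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # path -> (replica bytes, ReadLease)

    def store(self, path, replica):
        suffix = os.path.splitext(path)[1]
        duration = LEASE_SECONDS.get(suffix, DEFAULT_LEASE_SECONDS)
        lease = ReadLease(duration, self._clock)
        self._entries[path] = (replica, lease)
        return lease

    def serve(self, path):
        """Return the replica while its lease holds, else drop it and return None."""
        entry = self._entries.get(path)
        if entry is None or not entry[1].is_valid():
            self._entries.pop(path, None)
            return None
        return entry[0]
```

A `None` result from `serve` would correspond to the case in which the replica must be requested again over the WAN.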

[0067] Preferably, the method includes performing an operation on the replica of the data resource in the memory responsive to a management directive downloaded to the proxy receiver over the WAN. Preferably, the directive is encoded in a tag-based markup language, and performing the operation responsive to the directive includes parsing the markup language.

[0068] Preferably, intercepting the request includes intercepting a group of one or more requests for first data resources on the file server, and the method includes analyzing a pattern of the group of requests, and retrieving replicas of one or more second data resources from the file server to the memory of the proxy receiver, responsive to the pattern.

[0069] Preferably, retrieving the replicas of the one or more second data resources includes retrieving the second data resources before the client requests the second data resources.

[0070] Preferably, analyzing the pattern includes calculating for each of the second data resources on the file server a relation of an expected usage of the replicas of the second data resources at the proxy receiver to an expected modification rate of the second data resources at the file server.

[0071] Preferably, retrieving the replicas of the one or more second data resources includes analyzing a relation of an available bandwidth of the WAN to an expected usage of the replicas of the second data resources at the proxy receiver, and determining, responsive to the relation, when to retrieve a replica of the second data resource. Alternatively or additionally, retrieving the replicas of the one or more second data resources includes analyzing a first relation of an expected usage of the replicas of the second data resources at the proxy receiver to an expected modification rate of the second data resources at the file server, determining a second relation between an available bandwidth of the WAN and the first relation, and determining, responsive to the second relation, when to retrieve a replica of the second data resource.
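The specification does not fix the form of these relations. Purely as an illustrative assumption, the Python sketch below takes the first relation to be a ratio of expected reads to expected modifications, and takes the bandwidth relation to be a simple byte budget that the selected replicas must fit within. All names and thresholds are hypothetical.

```python
def prefetch_score(reads_per_day, mods_per_day):
    """Assumed 'first relation': expected reads per expected modification."""
    return reads_per_day / (mods_per_day + 1.0)  # +1 avoids division by zero


def plan_prefetch(candidates, spare_bytes, min_score=1.0):
    """Choose which replicas to retrieve now, given spare WAN capacity in bytes.

    candidates: list of (name, size_bytes, reads_per_day, mods_per_day).
    """
    ranked = sorted(
        candidates, key=lambda c: prefetch_score(c[2], c[3]), reverse=True
    )
    plan = []
    for name, size, reads, mods in ranked:
        if prefetch_score(reads, mods) < min_score:
            break  # remaining candidates change too often to be worth fetching
        if size <= spare_bytes:
            plan.append(name)
            spare_bytes -= size
    return plan
```

Under these assumptions, a frequently read and rarely modified resource is retrieved first, while a resource that changes more often than it is read is left for on-demand retrieval.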

[0072] Preferably, retrieving replicas of the one or more second data resources includes determining an order of retrieval of the second data resources responsive to a predetermined retrieval policy, and conveying the replicas over the WAN in the determined order. Preferably, in accordance with the retrieval policy, the first data resources requested by the client are retrieved with a higher priority than the second data resources.

[0073] In a preferred embodiment the method includes: intercepting at the proxy receiver a write request submitted by the client for application to the data resource; transmitting the write request via the WAN from the proxy receiver to the proxy transmitter; and passing the write request via the first LAN from the proxy transmitter to the file server.

[0074] Sometimes, intercepting the write request includes intercepting multiple write requests submitted by the client for application to the data resource, and aggregating the write requests in a write memory of the proxy receiver, and transmitting the write requests includes transmitting the aggregated write requests together via the WAN from the write memory of the proxy receiver to the proxy transmitter.

[0075] When the data resource includes multiple separate data resource items, aggregating the write requests preferably includes aggregating the write requests with respect to the multiple data resource items so as to transmit the aggregated write requests together.
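Write aggregation of this kind might be sketched as follows in Python. Write requests for any number of resource items are collected in a write memory and sent over the WAN as one batch. The `send_batch` callable stands in for the WAN transmission, and the `max_pending` threshold is a hypothetical flush trigger that is not drawn from the specification.

```python
class WriteAggregator:
    def __init__(self, send_batch, max_pending=8):
        self._send_batch = send_batch    # callable taking a list of write requests
        self._max_pending = max_pending  # hypothetical flush threshold
        self._pending = []               # the "write memory"

    def write(self, path, offset, data):
        """Hold an intercepted write request; flush once enough are pending."""
        self._pending.append((path, offset, data))
        if len(self._pending) >= self._max_pending:
            self.flush()

    def flush(self):
        """Transmit all held write requests together as one batch."""
        if self._pending:
            batch, self._pending = self._pending, []
            self._send_batch(batch)
```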

[0076] In a preferred embodiment, conveying the replica of the data resource includes conveying to the proxy receiver a write lease relating to the data resource, and transmitting the write request via the WAN from the proxy receiver to the proxy transmitter includes transmitting the write request via the WAN from the proxy receiver to the proxy transmitter upon expiration or revocation of the write lease. Preferably, conveying the write lease includes setting an expiration period of the write lease responsive to a file type of the data resource. Optionally, conveying the write lease includes locking the data resource at the file server, and the method includes unlocking the data resource at the file server upon termination of the expiration period of the write lease. When the proxy receiver is a first proxy receiver among a plurality of proxy receivers, the method preferably includes revoking, at the proxy transmitter, the write lease conveyed to the first proxy receiver if a second proxy receiver among the plurality of proxy receivers conducts a file system operation on the data resource.

[0077] Preferably, conveying the write lease includes checking a connection status of the WAN, and determining whether to maintain the write lease responsive to the connection status. Preferably, intercepting the write request includes receiving and holding the write request from the client at the proxy receiver while the WAN is disconnected, and transmitting the write request includes transmitting the write request when the WAN is reconnected and integrating the write request with the data resource at the file server.

[0078] There is also provided, in accordance with a preferred embodiment of the present invention, a method for enabling access to a data resource held on a file server on a first local area network (LAN) by a client on a second LAN, the method including:

[0079] intercepting a request to perform a file operation on the data resource submitted by the client, using a proxy receiver on the second LAN;

[0080] checking a receiver cache held by the proxy receiver to determine whether valid information necessary to fulfill the request is already present in the receiver cache;

[0081] responsive to the request and to determining that the valid information is not present in the receiver cache, transmitting via a wide area network (WAN) a message requesting the information from the proxy receiver to a proxy transmitter on the first LAN;

[0082] responsive to the message, conveying the information over the WAN from the proxy transmitter to the proxy receiver; and

[0083] fulfilling the request at the proxy receiver to the client using the information.

[0084] The valid information may include the data resource and/or metadata relating to the data resource.

[0085] In a preferred embodiment, the file operation is a metadata-only file operation, and the information includes metadata.

[0086] In a preferred embodiment, the request for the data resource is submitted by the client using a call to a native network file system used by the file server, and transmitting the message via the WAN includes transmitting the message via the WAN using the native network file system.

[0087] Preferably, the method further includes:

[0088] intercepting a further request to perform an operation on the data resource from another client on the second LAN;

[0089] checking the receiver cache to determine whether the valid information is already present in the receiver cache; and

[0090] responsive to the further request and to determining that the valid information is present, fulfilling the further request at the proxy receiver to the other client using the valid information.

[0091] Preferably, conveying the information includes checking a transmitter cache held by the proxy transmitter to determine whether the valid information necessary to fulfill the request is already present in the transmitter cache and, if so, conveying the information from the transmitter cache over the WAN to the proxy receiver. Further preferably, conveying the information includes, upon determining that the valid information is not present in the transmitter cache, fetching the information from the file server to the proxy transmitter, and conveying the fetched information over the WAN to the proxy receiver.
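The two-level lookup described above (receiver cache, then transmitter cache, then file server) is sketched below in Python for illustration. The file server is represented by a plain dictionary and validity checking is omitted, so every cached entry is treated as valid. The class names, method names and counters are assumptions of the sketch.

```python
class ProxyTransmitter:
    def __init__(self, file_server):
        self.file_server = file_server  # stand-in for the server: dict of path -> bytes
        self.cache = {}                 # transmitter cache
        self.server_fetches = 0

    def get(self, path):
        """Serve from the transmitter cache, fetching from the file server on a miss."""
        if path not in self.cache:
            self.cache[path] = self.file_server[path]
            self.server_fetches += 1
        return self.cache[path]


class ProxyReceiver:
    def __init__(self, transmitter):
        self.transmitter = transmitter  # stand-in for the WAN link to the transmitter
        self.cache = {}                 # receiver cache
        self.wan_requests = 0

    def fulfill(self, path):
        """Serve from the receiver cache, going over the WAN only on a miss."""
        if path not in self.cache:
            self.wan_requests += 1
            self.cache[path] = self.transmitter.get(path)
        return self.cache[path]
```

With two receivers sharing one transmitter, a resource requested at both sites would cross the first LAN only once in this sketch, since the second WAN request is answered from the transmitter cache.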

[0092] Preferably, conveying the metadata includes reading the metadata from files held by the file server using the proxy transmitter, and conveying the metadata from the proxy transmitter to the proxy receiver.

[0093] There is further provided, in accordance with a preferred embodiment of the present invention, a method for enabling access to a data resource, which is held on a file server on a first local area network (LAN), by a client on a second LAN, the method including:

[0094] conveying a replica of the data resource over a wide area network (WAN) from the file server to a cache held by a proxy receiver on the second LAN;

[0095] intercepting at the proxy receiver a file system request for the data resource submitted by the client over the second LAN;

[0096] checking the cache to determine whether the replica of the data resource is present in the cache and valid; and

[0097] responsive to the file system request and to determining that the replica is present and valid, serving the replica of the data resource from the cache of the proxy receiver to the client over the second LAN.

[0098] In a preferred embodiment, the request for the data resource is submitted by the client using a call to a native network file system used by the file server.

[0099] In a preferred embodiment, the method also includes:

[0100] intercepting a further request for the data resource from another client on the second LAN;

[0101] checking the cache to determine whether the replica of the data resource is present in the cache and valid; and

[0102] responsive to the further request and to determining that the replica is present and valid, serving the replica of the data resource from the cache of the proxy receiver to the other client over the second LAN.

[0103] In a preferred embodiment, the client is a first client among a plurality of clients on the second LAN, and serving the replica of the data resource from the cache includes serving the replica both to the first client and to a second client among the plurality of clients.

[0104] In a preferred embodiment, intercepting the request includes intercepting a lock request submitted by the client for a lock on the data resource, and conveying the replica over the WAN includes transmitting a lock message via the WAN from the proxy receiver to the file server, requesting the lock, and the method includes:

[0105] responsive to the lock message, issuing the lock at the file server;

[0106] conveying the lock over the WAN from the file server to the proxy receiver; and

[0107] serving the lock from the proxy receiver to the client.

[0108] Preferably, the method includes, upon determining that the replica is not present or not valid, requesting that the replica be conveyed again from the file server to the proxy receiver. Preferably, requesting that the replica be conveyed includes requesting that the replica be conveyed using a native network file system of the file server.

[0109] In a preferred embodiment, the method includes intercepting at the proxy receiver a write request submitted by the client for application to the data resource, and passing the write request over the WAN from the proxy receiver to the file server.

[0110] There is still further provided, in accordance with a preferred embodiment of the present invention, a method for enabling access to data resources held on a file server on a first local area network (LAN) by a client on a second LAN, the method including:

[0111] reading metadata from the file server using a proxy transmitter on the first LAN;

[0112] transmitting the metadata via a wide area network (WAN) from the proxy transmitter to a proxy receiver on the second LAN; and

[0113] based on the metadata, constructing at the proxy receiver a directory of the data resources on the file server, for use by the client in accessing the data resources.

[0114] Preferably, reading the metadata includes reading updated metadata from the file server subsequent to constructing the directory, and constructing the directory includes synchronizing the directory with the file server responsive to the updated metadata.
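By way of example, the construction and synchronization of such a directory from metadata might be sketched in Python as follows. The directory is modeled as a flat mapping from path to file attributes. The record format, the `deleted_paths` argument and the function names are hypothetical.

```python
def build_directory(records):
    """Construct a directory from (path, attributes) pairs sent by the transmitter."""
    return {path: dict(attrs) for path, attrs in records}


def synchronize(directory, updated_records, deleted_paths=()):
    """Apply updated metadata read from the file server after construction."""
    for path, attrs in updated_records:
        directory[path] = dict(attrs)
    for path in deleted_paths:
        directory.pop(path, None)


def list_dir(directory, folder):
    """List the immediate children of a folder using only the local directory."""
    prefix = folder.rstrip("/") + "/"
    names = {
        path[len(prefix):].split("/", 1)[0]
        for path in directory
        if path.startswith(prefix)
    }
    return sorted(names)
```

In this sketch a directory listing is answered at the proxy receiver from the metadata alone, without retrieving any file contents over the WAN.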

[0115] Preferably, the metadata includes file attributes of the data resources, which file attributes are stored in a directory object on the file server, and reading the metadata includes reading the file attributes from the directory object.

[0116] In a preferred embodiment, the data resources include files, and the metadata includes file attributes that are stored in the files, and reading the metadata includes reading the file attributes from the files.

[0117] In a preferred embodiment, the method includes intercepting at the proxy receiver a file system request with respect to one of the data resources in the directory submitted by the client over the second LAN, and, responsive to the file system request, serving data from the one of the data resources from the proxy receiver to the client over the second LAN.

[0118] In a preferred embodiment, intercepting the file system request includes intercepting a file operation request based on the metadata, and the method includes fulfilling the file operation request at the proxy receiver, and conveying a result of the fulfilled file operation request to the client over the second LAN.

[0119] There is also provided, in accordance with a preferred embodiment of the present invention, a method for enabling access to a data resource held by a file server, the method including:

[0120] submitting a first request via a wide area network (WAN) for access to the data resource from one or more sources able to receive the data resource from the file server;

[0121] receiving a response from a first source among the one or more sources indicating that the first source cannot provide a valid replica of the data resource;

[0122] caching a record indicating that the first source is unable to provide the valid replica of the data resource; and

[0123] submitting a second request for access to the data resource to at least a second source among the one or more sources, while avoiding, responsive to the cached record, sending the second request to the first source.
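The steps above amount to a form of negative caching, which might be sketched in Python as follows. Each source is modeled as a callable that returns the replica or `None`, and the set of cached records is passed in by the caller. These modeling choices and all names are assumptions of the sketch, which also omits any expiry of the cached records.

```python
def request_resource(resource, sources, unavailable):
    """Ask sources in turn for a valid replica, skipping known-unavailable ones.

    sources: dict of source name -> callable(resource) returning bytes or None.
    unavailable: set of (source name, resource) records cached from earlier replies.
    """
    for name, ask in sources.items():
        if (name, resource) in unavailable:
            continue  # a cached record says this source cannot provide the replica
        replica = ask(resource)
        if replica is not None:
            return replica
        unavailable.add((name, resource))  # cache the negative response
    return None
```

On a second request for the same resource, a source that previously reported that it could not provide a valid replica is not contacted again, which saves one WAN round trip per such source.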

[0124] There is yet additionally provided, in accordance with a preferred embodiment of the present invention, a method for enabling access to a data resource, which is held on a file server on a first local area network (LAN), by a client on a second LAN, the method including:

[0125] intercepting a request for the data resource submitted by the client, using a file system driver on the second LAN;

[0126] transmitting a message via a wide area network (WAN) from the file system driver to a proxy transmitter on the first LAN, requesting the data resource;

[0127] retrieving a replica of the data resource from the file server to the proxy transmitter;

[0128] responsive to the message, conveying the replica of the data resource over the WAN from the proxy transmitter to the file system driver; and

[0129] serving the replica of the data resource from the file system driver to the client over the second LAN.

[0130] There is still additionally provided, in accordance with a preferred embodiment of the present invention, apparatus for enabling access to a data resource, which is held on a file server on a first local area network (LAN), by a client on a second LAN, the apparatus including:

[0131] a proxy transmitter, which is adapted to retrieve a replica of the data resource from the file server over the first LAN; and

[0132] a proxy receiver, which is adapted to intercept a request for the data resource submitted by the client on the second LAN, and responsive to the request, to send a message via a wide area network (WAN) to the proxy transmitter on the first LAN, requesting the data resource, thus causing the proxy transmitter to convey the replica of the data resource over the WAN to the proxy receiver, which serves the replica of the data resource to the client over the second LAN.

[0133] There is further provided, in accordance with a preferred embodiment of the present invention, apparatus for enabling access to a data resource held on a file server on a first local area network (LAN) by a client on a second LAN, the apparatus including:

[0134] a proxy transmitter, which is adapted to hold the data resource; and

[0135] a proxy receiver, which includes a receiver cache, and which is adapted to intercept a request to perform a file operation on the data resource submitted by the client on the second LAN, to check the receiver cache to determine whether valid information necessary to fulfill the request is already present in the receiver cache, and responsive to the request and to determining that the valid information is not present in the receiver cache, to transmit a message requesting the information via a wide area network (WAN) to the proxy transmitter, thus causing the proxy transmitter to convey the information over the WAN to the proxy receiver, which fulfills the request using the information.

[0136] There is yet further provided, in accordance with a preferred embodiment of the present invention, apparatus for enabling access to a data resource, which is held on a file server on a first local area network (LAN), by a client on a second LAN, the apparatus including a proxy receiver, which includes a cache, the proxy receiver located on the second LAN and adapted to retrieve a replica of the data resource from the file server over a wide area network (WAN) to the cache, to intercept a file system request for the data resource submitted by the client over the second LAN, to check the cache to determine whether the replica of the data resource is present in the cache and valid, and, responsive to the file system request and to determining that the replica is present and valid, to serve the replica of the data resource from the cache to the client over the second LAN.

[0137] There is still further provided, in accordance with a preferred embodiment of the present invention, apparatus for enabling access to data resources held on a file server on a first local area network (LAN) by a client on a second LAN, the apparatus including a proxy receiver and a proxy transmitter, the proxy transmitter located on the first LAN and adapted to read metadata from the file server and to transmit the metadata via a wide area network (WAN) to the proxy receiver on the second LAN, and wherein the proxy receiver is adapted to construct a directory, based on the metadata, of the data resources on the file server, for use by the client in accessing the data resources.

[0138] There is additionally provided, in accordance with a preferred embodiment of the present invention, apparatus for enabling access by a client to a data resource held by a file server, the apparatus including a proxy receiver for serving the resource to the client, wherein the proxy receiver is adapted to submit a first request via a wide area network (WAN) for access to the data resource from one or more sources able to receive the data resource from the file server, and upon receiving a response from a first source among the one or more sources indicating that the first source cannot provide a valid replica of the data resource, to cache a record indicating that the first source is unable to provide the valid replica of the data resource, so that responsive to the cached record, the proxy receiver avoids sending to the first source a second request for access to the data resource, while submitting the second request to at least a second source among the one or more sources.

[0139] There is also provided, in accordance with a preferred embodiment of the present invention, apparatus for enabling access to a data resource, which is held on a file server on a first local area network (LAN), by a client on a second LAN, the apparatus including:

[0140] a proxy transmitter, which is adapted to retrieve a replica of the data resource from the file server over the first LAN;

[0141] a file system driver, which is adapted to intercept a request for the data resource submitted by the client on the second LAN, and responsive to the request, to send a message via a wide area network (WAN) to the proxy transmitter on the first LAN, requesting the data resource, thus causing the proxy transmitter to convey the replica of the data resource over the WAN to the file system driver, which serves the replica of the data resource to the client over the second LAN.

[0142] There is further provided, in accordance with a preferred embodiment of the present invention, a computer software product for enabling access to a data resource, which is held on a file server on a first local area network (LAN), by a client on a second LAN, the product including a computer-readable medium, in which program instructions are stored, which instructions, when read by a first computer on the first LAN, cause the first computer to operate as a proxy transmitter, so as to retrieve a replica of the data resource from the file server over the first LAN, and which instructions, when read by a second computer on the second LAN, cause the second computer to operate as a proxy receiver, so as to intercept a request for the data resource submitted by the client on the second LAN, and responsive to the request, to send a message via a wide area network (WAN) to the proxy transmitter on the first LAN, requesting the data resource, thus causing the proxy transmitter to convey the replica of the data resource over the WAN to the proxy receiver, which serves the replica of the data resource to the client over the second LAN.

[0143] There is still further provided, in accordance with a preferred embodiment of the present invention, a computer software product for enabling access to a data resource held on a file server on a first local area network (LAN) by a client on a second LAN, the product including a computer-readable medium, in which program instructions are stored, which instructions, when read by a computer on the second LAN, cause the computer to operate as a proxy receiver having a receiver cache, so as to intercept a request to perform a file operation on the data resource submitted by the client on the second LAN, and to check the receiver cache to determine whether valid information necessary to fulfill the request is already present in the receiver cache, and responsive to the request and to determining that the valid information is not present in the receiver cache, to transmit a message requesting the information via a wide area network (WAN) to a proxy transmitter on the first LAN, thus causing the proxy transmitter to convey the information over the WAN to the computer, which fulfills the request using the information.

[0144] There is additionally provided, in accordance with a preferred embodiment of the present invention, a computer software product for enabling access to a data resource, which is held on a file server on a first local area network (LAN), by a client on a second LAN, the product including a computer-readable medium, in which program instructions are stored, which instructions, when read by a computer on the second LAN, cause the computer to operate as a proxy receiver having a cache, so as to retrieve a replica of the data resource from the file server over a wide area network (WAN) to the cache, to intercept a file system request for the data resource submitted by the client over the second LAN, to check the cache to determine whether the replica of the data resource is present in the cache and valid, and, responsive to the file system request and to determining that the replica is present and valid, to serve the replica of the data resource from the cache to the client over the second LAN.

[0145] There is yet additionally provided, in accordance with a preferred embodiment of the present invention, a computer software product for enabling access to data resources held on a file server on a first local area network (LAN) by a client on a second LAN, the product including a computer-readable medium, in which program instructions are stored, which instructions, when read by a first computer on the first LAN, cause the first computer to operate as a proxy transmitter, so as to read metadata from the file server, and to transmit the metadata via a wide area network (WAN) to the second LAN, and which instructions, when read by a second computer on the second LAN, cause the second computer to operate as a proxy receiver, and to construct a directory, based on the metadata, of the data resources on the file server, for use by the client in accessing the data resources.

[0146] There is further provided, in accordance with a preferred embodiment of the present invention, a computer software product for enabling access by a client to a data resource held by a file server, the product including a computer-readable medium in which program instructions are stored, which instructions, when read by a computer, cause the computer to submit a first request via a wide area network (WAN) for access to the data resource from one or more sources able to receive the data resource from the file server, so as to provide the data resource to the client, and wherein the instructions further cause the computer, upon receiving a response from a first source among the one or more sources indicating that the first source cannot provide a valid replica of the data resource, to cache a record indicating that the first source is unable to provide the valid replica of the data resource, so that responsive to the cached record, the computer avoids sending to the first source a second request for access to the data resource, while submitting the second request to at least a second source among the one or more sources.

[0147] There is still additionally provided, in accordance with a preferred embodiment of the present invention, a computer software product for enabling access to a data resource, which is held on a file server on a first local area network (LAN), by a client on a second LAN, the product including a computer-readable medium, in which program instructions are stored, which instructions, when read by a first computer on the first LAN, cause the first computer to operate as a proxy transmitter, so as to retrieve a replica of the data resource from the file server over the first LAN, and which instructions, when read by a second computer on the second LAN, cause the second computer to operate as a file system driver, so as to intercept a request for the data resource submitted by the client on the second LAN, and responsive to the request, to send a message via a wide area network (WAN) to the proxy transmitter on the first LAN, requesting the data resource, thus causing the proxy transmitter to convey the replica of the data resource over the WAN to the file system driver, which serves the replica of the data resource to the client over the second LAN.

[0148] The present invention will be more fully understood from the following detailed description of a preferred embodiment thereof, taken together with the drawings, in which:

BRIEF DESCRIPTION OF THE DRAWINGS

[0149] FIG. 1 is a block diagram that schematically illustrates a distributed computer system including a Virtual File-Sharing Network (VFN) system, in accordance with a preferred embodiment of the present invention;

[0150] FIG. 2 is a block diagram that schematically illustrates a VFN system deployed on a WAN connecting several LANs, in accordance with a preferred embodiment of the present invention;

[0151] FIG. 3 is a block diagram that schematically illustrates details of a VFN gateway, in accordance with a preferred embodiment of the present invention;

[0152] FIG. 4 is a block diagram that schematically illustrates the protocol architecture of a VFN system, in accordance with a preferred embodiment of the present invention;

[0153] FIG. 5 is a block diagram that schematically illustrates a VFN management subsystem, in accordance with a preferred embodiment of the present invention;

[0154] FIG. 6 is a flow chart that schematically illustrates a method for requesting an operation on a resource, in accordance with a preferred embodiment of the present invention;

[0155] FIG. 7 is a schematic illustration of a virtual directory, in accordance with a preferred embodiment of the present invention;

[0156] FIG. 8 is a flow chart that schematically illustrates a method for requesting a read operation, in accordance with a preferred embodiment of the present invention;

[0157] FIG. 9 is a flow chart that schematically illustrates a method for requesting a write operation, in accordance with a preferred embodiment of the present invention;

[0158] FIG. 10 is a block diagram that schematically illustrates the deployment of a VFN file agent, in accordance with a preferred embodiment of the present invention;

[0159] FIG. 11 is a block diagram that schematically illustrates details of a VFN gateway that relate to lock management, in accordance with a preferred embodiment of the present invention;

[0160] FIG. 12 is a block diagram that schematically illustrates details of a VFN application transport layer, in accordance with a preferred embodiment of the present invention;

[0161] FIG. 13 is a block diagram that schematically illustrates details of a client application transport layer, in accordance with a preferred embodiment of the present invention;

[0162] FIG. 14 is a flow chart that schematically illustrates a method for processing an RPC request by an RPC client, in accordance with a preferred embodiment of the present invention;

[0163] FIG. 15 is a block diagram that schematically illustrates details of a server application transport layer, in accordance with a preferred embodiment of the present invention; and

[0164] FIG. 16 is a flow chart that schematically illustrates a method for processing an RPC request by an RPC server, in accordance with a preferred embodiment of the present invention.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

System Overview

[0165] FIG. 1 is a block diagram that schematically illustrates a distributed computer system 18 including a virtual file-sharing network (VFN) system 20, in accordance with a preferred embodiment of the present invention. The distributed computer system includes two or more geographically-remote local area networks (LANs) 21 a and 21 b, interconnected through a wide area network (WAN) over an interconnection 29. System 18 also includes at least one file server 25, located on LAN 21 a, and at least one client 28, located on second LAN 21 b. The file server and client may use substantially any distributed file system known in the art, such as NFS, CIFS, or other file systems mentioned in the Background of the Invention.

[0166] VFN system 20 comprises at least one VFN transmitter 52 connected to file server 25 over LAN 21 a, and at least one VFN receiver 48 connected to client 28 over LAN 21 b. The VFN transmitter and VFN receiver communicate with one another over interconnection 29 provided by the WAN. The VFN transmitter and receiver are described in detail hereinbelow. Typically, the transmitter and receiver comprise standard computer servers with appropriate memory, communication interfaces and software for carrying out the functions prescribed by the present invention. This software may be downloaded to the transmitter and receiver in electronic form over a network, for example, or it may alternatively be supplied on tangible media, such as CD-ROM.

[0167] In order to serve a resource held by file server 25 to client 28, VFN transmitter 52 fetches the resource from file server 25 and transmits the resource over the WAN to VFN receiver 48, which then serves the resource to client 28. Client 28 and file server 25 interact transparently via their standard native network file system interfaces, without the need for special client or server VFN software. VFN receiver 48 efficiently and transparently makes remote resources available to client 28 by a combination of file replicating (“pre-positioning”) and caching. Receiver 48 invokes on-demand retrieval when the requested resource has not previously been pre-positioned or cached, or if the cached version of the resource has become outdated. Preferably, VFN system 20 provides end-to-end support for file sizes of at least up to 2 gigabytes.
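The receiver behavior described above (serve a pre-positioned or cached replica when one is available and current, otherwise retrieve on demand) can be illustrated with a minimal sketch. The class and method names below (`ReceiverCache`, `preposition`, `mark_outdated`, `read`) and the callable standing in for the VFN transmitter are hypothetical and are not taken from the specification; the sketch only shows the decision logic, not the WAN transfer itself.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Replica:
    data: bytes
    fresh: bool  # False once the local copy is considered outdated


class ReceiverCache:
    """Hypothetical receiver-side store of pre-positioned and cached replicas."""

    def __init__(self, fetch_from_transmitter: Callable[[str], bytes]):
        # Assumption: a callable stands in for the request sent over the WAN.
        self._fetch = fetch_from_transmitter
        self._replicas: Dict[str, Replica] = {}

    def preposition(self, path: str, data: bytes) -> None:
        """Store a replica ahead of any client request."""
        self._replicas[path] = Replica(data, fresh=True)

    def mark_outdated(self, path: str) -> None:
        """Flag a held replica as stale so the next read refetches it."""
        if path in self._replicas:
            self._replicas[path].fresh = False

    def read(self, path: str) -> bytes:
        """Serve locally when possible; otherwise retrieve on demand."""
        replica = self._replicas.get(path)
        if replica is not None and replica.fresh:
            return replica.data
        data = self._fetch(path)  # on-demand retrieval
        self._replicas[path] = Replica(data, fresh=True)
        return data
```

In this sketch a second `read` of the same path does not call the transmitter again unless `mark_outdated` has been invoked in between.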

[0168] “WAN,” as used in the specification and the claims, is to be understood as a geographically dispersed network connecting two or more LANs. Many different WAN configurations are possible, including WANs using dedicated leased lines, permanent virtual circuits (such as frame relay links), virtual private networks (VPNs) (which typically operate over the public Internet), and/or satellite links. A WAN sometimes comprises an intranet (a private network contained within an enterprise, which uses Internet protocols) and/or an extranet (part of an intranet that has been extended to users outside the enterprise). “WAN” is also to be understood as comprising the public Internet. “Resource,” as used in the specification and the claims, is to be understood as including, but not being limited to, files, content, directories, and file metadata.

[0169] FIG. 2 is a block diagram that schematically illustrates computer system 18 deployed over WAN interconnections 29, in accordance with a preferred embodiment of the present invention. The WAN interconnections connect several LANs 21 a, 21 b and 21 c, which are referred to generically as LAN 21. Typically, the VFN system is deployed on numerous LANs connected by a topologically-complex WAN. For the sake of simplicity of illustration, however, and without loss of generality, only three LANs connected by a simple WAN are shown in FIG. 2. Each LAN 21 includes a VFN gateway 22, which typically comprises its own VFN transmitter 52 and VFN receiver 48. The VFN transmitter and VFN receiver can run on the same physical host, or on different hosts. Alternatively, a VFN gateway can include only a VFN transmitter or a VFN receiver, in the manner shown in FIG. 1. VFN gateways 22 communicate with one another over interconnection 29 provided by their respective WAN gateways 24. The WAN gateways can comprise any combination of VPN gateways, routers, repeaters, bridges, switches, gateways or other means of connecting LANs into a WAN, as are known in the art.

[0170] The VFN transmitter of each VFN gateway fetches resources from at least one file server 25 on its respective LAN, and transmits these resources to one or more VFN receivers located in other VFN gateways. For example, as shown in FIG. 2, VFN transmitter 52 a transmits resources to VFN receivers 48 b and 48 c. Likewise, a VFN receiver can receive resources from more than one VFN transmitter. While LANs 21 are shown as having only one file server each, the LANs can have more than one file server from which their respective VFN transmitters fetch resources. The file servers may run the same distributed file system or, alternatively, different file servers may run different file systems, all of which are accessed by the VFN gateways. Additionally, each LAN can include one or more Web/FTP servers 26 from which the VFN transmitters fetch and transmit resources, as well.

[0171] FIG. 3 is a block diagram that schematically illustrates details of VFN gateway 22, in accordance with a preferred embodiment of the present invention. The illustrated VFN gateway includes both a VFN transmitter 52 and a VFN receiver 48; however, as noted above, the transmitter and receiver functions of the VFN gateways are essentially separate, and a VFN gateway may therefore be configured to include only a VFN transmitter or a VFN receiver, and not both. The functional blocks that make up gateway 22 are typically implemented as software components, which run together on the same computer processor. Alternatively, different functional blocks of gateway 22 may be separated and run on different processors.

[0172] VFN transmitter 52 comprises a transmitter application layer 42, which provides services for, and control over, access to local information repositories, such as file servers 27 and 31 (collectively represented by file servers 25 in FIG. 2) and optionally Web/FTP servers 26. Services provided by the transmitter application layer include access to and transfer of shared resources, scheduled crawling, synchronization with remote copies, authentication and authorization, and resource usage tracking for various purposes, including billing. Optionally, VFN transmitter 52 comprises a cache 77. In this case, when a VFN receiver requests a resource for which the VFN transmitter holds a valid cached copy, the VFN transmitter serves the resource from its cache rather than first requesting a copy of the resource from its origin file server 25. Alternatively or additionally, when a VFN gateway comprises both a VFN receiver and a VFN transmitter, the VFN receiver and VFN transmitter may comprise a shared cache (which optionally is in addition to independent caches), which may provide more efficient resource sharing and/or improved management, and support loop-back access, as described below.

[0173] VFN transmitter 52 further comprises a repository connector layer 50, a software component which comprises one or more clients. These clients access resources on file servers 27 and 31 using the native network file system protocol of each file server. For illustrative purposes, repository connector layer 50 is shown to include an NFS client 62, for accessing resources stored on NFS file server 27, and a CIFS client 64, for accessing resources stored on CIFS file server 31. Alternatively or additionally, repository connector 50 includes clients for accessing other network file systems or sources of resources, such as e-mail servers. Repository connector 50 may additionally comprise an HTTP/FTP client 66 that accesses resources stored on Web/FTP server 26, using standard HTTP and/or FTP protocols. Preferably, client 66 supports the Secure Sockets Layer (SSL) for connecting to Web sites using HTTPS. VFN receiver 48 preferably records the type of server from which each resource originates, in order to apply the appropriate level of consistency, as described below.

[0174] VFN receiver 48 comprises a receiver application layer 40, which provides services to one or more local clients 28 by effectively fetching and maintaining local copies of remote resources in a cache 76. VFN receiver 48 further comprises an interception layer 54, which comprises servers that intercept local clients' requests for resources held on remote servers, such as servers 26, 27 and 31 on remote LANs. Interception layer 54 communicates these requests to receiver application layer 40, which fulfills them with cached data, if possible, or by obtaining the resources from a remote VFN transmitter 52. For illustrative purposes, interception layer 54 is shown as including an NFS server 56, for intercepting requests to remote NFS servers; a CIFS server 58, for intercepting requests to remote CIFS servers; and an HTTP server 60, for intercepting requests to remote HTTP servers. Alternatively or additionally, interception layer 54 may include servers for intercepting requests to other remote servers or sources of resources, such as other network file systems, FTP servers, or e-mail servers.

[0175] Optionally, VFN gateways 22 perform cross-file-system protocol translation, so that a client 28 running one file system protocol may access resources on a remote file server 25 running a different file system protocol. In implementations that do not support such cross-protocol translation, interception layer 54 typically includes only server types corresponding to the client types included in repository connector 50. In implementations that support such cross-protocol translation, server and client types do not necessarily correspond. Although interception layer 54 is shown conceptually as a separate component in FIG. 3, this separation is solely for clarity of illustration. Preferably, the servers included in interception layer 54 are integrated into receiver application layer 40 and run in the same process as the application layer.

[0176] VFN transmitter 52 and VFN receiver 48 each comprise an adaptation layer 45, which ensures reliable and efficient use of available WAN bandwidth for transfer of files between VFN gateways. The adaptation layer communicates with an application transport layer 46, which provides services for activation of remote services and inter-VFN gateway communication. The remote services are used by adaptation layer 45 and the higher transmitter and receiver application layers, as described in detail hereinbelow. Preferably, application transport layer 46 provides inter-VFN gateway communication services over the WAN through VFN HTTP servers 78, which are connected to WAN gateways 24.

[0177] When VFN transmitter 52 and VFN receiver 48 reside in the same host, they preferably share a single VFN HTTP server 78. Preferably, HTTP server 60 and VFN HTTP server 78 are Apache servers. Alternatively, the communication function of VFN HTTP server 78 is performed by a non-HTTP server, using another network protocol, such as FTP.

[0178] VFN HTTP servers 78 additionally communicate with a VFN manager to download configuration settings and directives, as shown and described below with reference to FIG. 5. VFN transmitter 52 and VFN receiver 48 each comprise a control agent 36, which implements directives periodically downloaded from the VFN manager. The control agents also collect activity data, which is used by the VFN manager for various activity analyses and reports.

[0179] VFN transmitter 52 and VFN receiver 48 further comprise a lease manager 44 and lease client 38, respectively, for managing leases used to implement the VFN system's consistency protocols. These protocols are described below with reference to FIGS. 8 and 9.

[0180] Reference is now made to FIG. 4, which is a block diagram that schematically illustrates the protocol architecture of VFN system 20, in accordance with a preferred embodiment of the present invention. This figure provides a different perspective on the elements of system 20, and particularly of gateway 22, that are shown in FIG. 3. The three lowest layers of the architecture are a network transport layer 70, a network layer 72, and a data link (or MAC) layer 74, which is an abstraction of the WAN and/or LAN. These layers are preferably implemented using standard LAN and Internet protocols, such as Transmission Control Protocol/Internet Protocol (TCP/IP) and/or User Datagram Protocol/Internet Protocol (UDP/IP). Client 28, which is represented as an application layer entity, typically comprises a standard network file system client, such as an NFS or CIFS client, and/or a standard Web/FTP client. Likewise, the application layer of file server 25 comprises a standard network file server or Web/FTP server. (File server 25 optionally includes a VFN file agent, as described below with reference to FIG. 10.)

[0181] The application layers of VFN transmitter 52 and VFN receiver 48 are divided into lower and upper layers. The upper layer comprises transmitter application layer 42 and receiver application layer 40. The lower layer provides communication services to the upper layer, and comprises adaptation layer 45 and application transport layer 46, which communicate over the WAN. The lower application layer also includes the LAN-facing components of the VFN transmitter and VFN receiver: repository connector layer 50 and interception layer 54, respectively.

[0182] Although the protocol architecture shown in FIG. 4 is based on standard LAN and Internet protocols, the VFN application layers may similarly be adapted to work over network protocols of other types. For example, VFN system 20 may be configured, as well, to operate over cellular packet data networks and/or wireless LANs. In such embodiments, the VFN receiver protocol is preferably adapted to enable mobile users to automatically discover and connect to the closest VFN receiver.

[0183] The VFN receiver and VFN transmitter preferably run over the Sun® Solaris™ Version 2.7 or 2.8 operating system. Preferably, receiver application layer 40 and transmitter application layer 42 are written in Java™ and run on a Java2 Virtual Machine, such as JRE 1.3. Where appropriate, Java™ Native Interface (JNI) calls are preferably used to provide file system functionality not included in Java's reduced cross-platform file access capabilities. Preferably, NFS server 56 supports multiple versions of NFS, including NFS version 2, and various different mount protocols, as are known in the art.

[0184] Security for the cache, file metadata, and configuration is provided by password encryption of all files. Additionally, when the VFN system is deployed on UNIX servers, protection is also provided through file server user access rights. Preferably, file system users of a VFN receiver are given access only to cached file system resources, and not to cached HTTP resources.

VFN Management Subsystem

[0185] FIG. 5 is a block diagram that schematically illustrates a VFN management subsystem 33, in accordance with a preferred embodiment of the present invention. The VFN management subsystem comprises a VFN manager 30 and one or more manager consoles 32, which enable administrators to remotely configure and define policies for VFN gateways. VFN manager 30 communicates with VFN gateways through control agents 36 in each VFN gateway 22. Control agents 36 access receiver and transmitter application layers 40 and 42 for data or control.

[0186] Preferably, VFN management subsystem 33 centrally controls, configures, and manages all VFN gateways and administers the VFN system's policy control mechanism. Alternatively, the VFN gateways may be controlled and configured using a distributed approach, such as a peer-to-peer approach. Alternatively or additionally, the VFN system supports local administration of some or all components and/or policies. For example, certain locally-defined and mostly static configuration parameters, such as proxy host names, may be defined in the local configuration of the VFN gateways.

[0187] Preferably, the behavior of specific VFN gateways can be further customized by the use of an Application Program Interface (API) provided by the VFN management subsystem, which is exposed to external applications 34. The API is preferably Java-based. For example, a VFN gateway can be customized to treat a set of resources atomically, so that upon the invalidation of any member of the set, fresh copies of all other members of the set are also fetched.

[0188] VFN manager 30 maintains a database or configuration file containing configuration information and policies (“directives”) for each VFN gateway. Directives are translated by a component in the VFN manager into a tag-based markup language for storage in the VFN manager's database. The VFN management subsystem includes a utility for connecting and disconnecting VFN transmitter mount points to origin file servers. This utility is run remotely, through the VFN manager, or directly on control agent 36 of VFN transmitter 52. The location of the utility is preferably configured responsive to management policies of the enterprise, such as whether distributed or centralized control is desired. Preferably, VFN transmitters allow remote querying of available mount points for administrative purposes, for example, for creating a new link between a VFN receiver and a mount.

[0189] Manager console 32 is an administrative tool that enables administrators to create VFN gateways and define directives. Preferably, resources are explicitly registered with the VFN system by an administrator. Registered resources are preferably identified by a path comprising the origin file server name and IP address, and the share or mount point name. An administrator can register the resources on an entire origin file server or limit the registration to resources on specified server shares. Each manager console controls multiple VFN gateways. The manager console preferably provides an integrated view of the VFN system topology, state (including system and component configuration), monitoring (including operational characteristics), statistics, and directives. Manager console 32 preferably comprises an interactive visual site explorer, similar to the site mapper described above, that browses resources on HTTP servers 78 embedded in VFN transmitters 52 for resource listing.

[0190] When it is necessary to traverse firewalls, the site mapper preferably accesses remote file system contents by communicating with a site explorer agent in a VFN transmitter local to the remote file system. The agent performs the traversal locally. Such communication is performed using adaptation layer 45. Alternatively, manager console 32 communicates directly with the site explorer agent using HTTP, when firewalls do not block such direct communications. In order to access these HTTP servers 78, the console contains an HTTP client, which has access to all VFN transmitter components.

[0191] Preferably, VFN management subsystem 33 enables remote monitoring of the activity of VFN gateways. VFN manager 30 monitors the state of each VFN gateway, and the VFN gateways periodically ping the VFN manager. Manager console 32 uses this information to visually indicate which VFN gateways are active and inactive. Logs are generated by each VFN gateway, including information about the gateway's state, load, file request distribution and access records (such as request URL, VFN transmitter, and VFN receiver return codes, and roundtrip times), cache statistics (such as cache quotas and allocations), error statistics, and unused replications. These logs are periodically uploaded to the VFN manager, either at defined intervals or when free-storage capacity in the VFN receiver reaches a defined limit. The VFN manager uses these logs to generate statistical reports, using utility programs invoked by a VFN administrator. A VFN administrator can view these logs and statistical reports using the manager console. This information is also used as an input into the pre-positioning algorithms, described below.

[0192] The generation of each log type is independently enabled by the manager console, and the VFN receivers collect and upload logs independently from one another. Logging, except error logging, may be disabled by a VFN administrator.

[0193] VFN manager 30 and manager console 32 preferably provide remote control of installed system components, including start, stop, and restart. Additionally, the manager console preferably provides clear error notifications. The VFN system optionally supports external notification of errors, for example by e-mail.

[0194] Preferably, there are two kinds of users of the manager console: administrators and policy editors (referred to herein collectively as “VFN administrators”). Administrators can create new VFN gateways and define management directives that apply to an entire VFN gateway. Policy editors can only define service directives that apply to certain resources. Preferably, the manager console provides means for controlling the access of different VFN administrators to different VFN gateways. Additionally, the manager console preferably provides automatic conflict resolution when conflicting directives are generated by either the same or different VFN administrators.

[0195] The control agent in each VFN receiver periodically automatically downloads its specific remote configuration information and directives from the VFN manager. Downloads are preferably done using HTTP. To enhance security, preferably HTTP authentication and SSL are used. If a change in directives is detected, the VFN receiver downloads, parses, and integrates the modified set into the running VFN receiver. The VFN receiver then activates the services specified. Generally, most directives are activated on a time schedule by the VFN receiver. Several directives may be activated in parallel, agnostic to one another. If an error occurs during download or parsing, the VFN gateway disregards the new service set and continues to use the previous set until the next download period. This policy is intended to ensure a consistent view of the service set.
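The fallback rule described above (keep the previous directive set whenever a download or parse fails) can be sketched as follows. This is an illustration only: the `ControlAgent` class, its `poll` method, and the use of JSON as the wire format are assumptions made for brevity; the specification describes a tag-based markup language and HTTP transport, neither of which is modeled here.

```python
import json
from typing import Callable, List, Optional


class ControlAgent:
    """Hypothetical control agent that polls a manager for directives."""

    def __init__(self, download: Callable[[], str], initial: List[dict]):
        # Assumption: `download` returns the raw directive set as text and
        # raises OSError on a transport failure.
        self._download = download
        self._last_raw: Optional[str] = None
        self.active: List[dict] = initial

    def poll(self) -> bool:
        """Run one download period; return True if a new set was activated."""
        try:
            raw = self._download()
            if raw == self._last_raw:
                return False  # no change detected
            parsed = json.loads(raw)  # JSONDecodeError is a ValueError
            if not isinstance(parsed, list):
                raise ValueError("directive set must be a list")
        except (OSError, ValueError):
            # Disregard the new set; the previous set stays in force
            # until the next download period.
            return False
        self.active = parsed
        self._last_raw = raw
        return True
```

Because `self.active` is replaced only after the whole set has parsed, a partially downloaded or malformed set never becomes visible, which matches the stated aim of a consistent view of the service set.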

[0196] Preferably, VFN management subsystem 33 can invoke a system reset operation, which instructs VFN receiver 48 to reset all or part of its components, including their state, information, and/or directives. When a reset operation is performed, the VFN receiver reloads the current initial state from the VFN manager. Some VFN receiver components may additionally reread and process their local configuration parameters. The reset operation is parameterized by a discrete activation time, and accepts a service-specific parameter for the type of reset requested, including: all, directives, and cache (reset the cache data and metadata, losing all cached resource information).

[0197] Typically, VFN manager 30 runs over Sun® Solaris™ 2.7 or 2.8, and uses a standard HTTP server, preferably Apache. The configuration database is preferably a SQL server database, such as MySQL. Preferably, applications 34 for the VFN manager are coded in CGI scripts or Perl. The VFN manager may either be deployed on a dedicated host or on the same host as a VFN receiver and/or VFN transmitter. To enhance security, VFN manager 30 may use a port other than the standard port 80 for HTTP access to gateways 22. Secure communication lines are preferably used when the VFN manager or manager console are operated from a remote location.

[0198] Manager console 32 is typically a single-user application that runs on a Windows NT or Windows 2000 system. Alternatively or additionally, the manager console is a browser-based client, which provides support for remote administration. Manager console 32 preferably includes an FTP client, which is used for retrieving policy directive information from the database held by the VFN manager. Before conveying the stored directives to the manager console, the VFN manager preferably converts the directives into XML form, so that they can be easily read and edited by the user of the manager console. Manager console 32 then publishes user-defined directives to the VFN manager, either according to a preset schedule or pursuant to an explicit user command. VFN management subsystem 33 preferably provides for safe changes in the event a configuration session is prematurely terminated. Configuration backup and restore from a remote location is preferably supported, as well.

[0199] Directives

[0200] In the context of the present patent application and in the claims, a directive is a combination of conditions that, upon satisfaction, causes a predefined action to be executed in a VFN gateway, overriding the default VFN gateway behavior. Directives are either defined by a VFN administrator, as described above, or, under certain circumstances, automatically and/or adaptively generated. For example, directives can be automatically generated by an external application through an API provided by the VFN system. Preferably, new directives are adaptively generated and/or existing directives are adaptively modified by a VFN transmitter or VFN receiver that detects access patterns in real time. Directives include system-wide configuration parameters, actions to be carried out by a specific VFN receiver (for example, pre-position all files under a directory), and information relating to resources shared between the VFN gateway sites (for example, the expected change frequency of resources). Directives may be defined for an entire VFN system, a single VFN gateway, or a group of VFN gateways. VFN gateway groups provide a logical view of related VFN gateways and make policy definitions easier to manage than on a per-VFN-gateway basis. The grouping criteria are defined by a VFN administrator and can include, for example, geographical location, business functions, and/or expected resource usage patterns.

[0201] Directives preferably have three types of parameters: content, time, and, for HTTP-related directives, the presence and/or value of certain HTTP headers. Directives may include context-sensitive values.

[0202] The content parameter specifies one or more files or directories, specified as fully qualified Uniform Resource Locators (URLs) or patterns on which the directive should operate. Elements may be specified manually or via the interactive visual site explorer mentioned above. A URL pattern specification preferably includes a scheme (HTTP or FTP), a hostname, a path, and an optional file name.

[0203] There are two broad types of time directives: discrete and continuous. Discrete directives perform an action at a specific time, while continuous directives operate over an interval of time. For example, a directive for pre-positioning resources is typically discrete because it specifies when to perform the pre-position activity. In contrast, cache policy directives are typically continuous because they define a period during which certain caching policies are applied to a specified resource. Preferably, the default value for a discrete time directive is “now”.

[0204] Recurrence is a time property that can be applied to all directives. For example, a discrete-time directive, such as for pre-positioning, can be activated every day at midnight. Similarly, a continuous-time directive, such as for a cache policy, can be activated every day between 9:00 a.m. and 5:00 p.m. Preferably, the recurrence granularity ranges from minutes (smallest) to years (largest).
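The two recurrence examples above can be made concrete with a small sketch limited to daily recurrence. The function names `continuous_active` and `next_discrete` are hypothetical, and the treatment of windows that wrap past midnight is an assumption not addressed in the text.

```python
from datetime import datetime, time, timedelta


def continuous_active(now: datetime, start: time, end: time) -> bool:
    """Return True if `now` lies inside the daily window [start, end)."""
    t = now.time()
    if start <= end:
        return start <= t < end
    # Assumption: a window such as 22:00 to 06:00 wraps past midnight.
    return t >= start or t < end


def next_discrete(now: datetime, at: time) -> datetime:
    """Return the next daily activation time strictly after `now`."""
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate
```

For a cache policy active between 9:00 a.m. and 5:00 p.m., `continuous_active(now, time(9), time(17))` holds throughout the working day, while `next_discrete(now, time(0))` gives the coming midnight for a nightly pre-positioning run.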

[0205] For HTTP-based content, directives can be further parameterized to evaluate the values of multiple HTTP request headers. Any HTTP header may be specified and its value matched against a pattern expression.

[0206] Directives that can be defined preferably include:

[0207] Pre-position, which is used to control and manage resource pre-positioning from VFN transmitters to remote VFN receivers. The directive specifies which resources should be pre-positioned and when. Pre-positioning candidates include infrequently changing, large resources that are likely to be in demand at the remote site. Preferably, pre-positioning candidates are additionally selected using usage profiling generated from information collected by resource usage tracking, as described above with reference to FIG. 3.

[0208] Cache consistency policy, which allows customization of the VFN receiver cache resource addition, removal, and revalidation policies. This directive can specify explicit rules for including or excluding resources and/or resource sets from the cache, for setting their revalidation period and general consistency level, and for setting their caching priority class and replacement policy. For directives that operate on cached resources, a parameter is preferably included that specifies to which type of cached resources the directive applies: “sticky” or “normal,” as described below, or “don't care,” which indicates that the directive operates on both “sticky” and “normal” cached resources.

[0209] Active refresh, which is used to update resources which are cached in a VFN receiver, and to remove resources from a VFN receiver cache 76 that no longer exist on the origin site.

[0210] Active invalidate, which is used to mark resources in a VFN receiver cache 76 as invalid (soft invalidation) or explicitly remove resources from a VFN receiver cache (hard invalidation). This directive explicitly ensures freshness of remote copies, overriding the cache's internal policies and heuristics.
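The difference between soft and hard invalidation in the active invalidate directive can be illustrated with a minimal sketch. The `CacheIndex` class and its methods are hypothetical; in particular, what a receiver does with a soft-invalidated entry (here it is retained but reported as invalid) is an assumption made for illustration.

```python
from typing import Dict, Optional, Tuple


class CacheIndex:
    """Hypothetical index of cached resources keyed by URL."""

    def __init__(self) -> None:
        self._entries: Dict[str, dict] = {}

    def put(self, url: str, data: bytes) -> None:
        self._entries[url] = {"data": data, "valid": True}

    def invalidate(self, url: str, hard: bool = False) -> None:
        """Soft: mark the entry invalid. Hard: remove it from the cache."""
        if url not in self._entries:
            return
        if hard:
            del self._entries[url]
        else:
            self._entries[url]["valid"] = False

    def lookup(self, url: str) -> Optional[Tuple[bytes, bool]]:
        """Return (data, valid) for a held entry, or None if absent."""
        entry = self._entries.get(url)
        if entry is None:
            return None
        return entry["data"], entry["valid"]
```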

[0211] URL translation (applies to HTTP resources only), which applies a translation rule to requested URLs. When a URL is requested for which a URL translation is defined, the URL resulting from applying the translation rule will be returned.

[0212] Request modification (applies to HTTP resources only), which applies a modification rule to HTTP requests by setting HTTP request header values.

[0213] Reset component, which selectively resets components of a VFN gateway.

[0214] Logging policy, which enables a VFN administrator to control the granularity and type of reporting produced by VFN gateways, sampling rates for monitoring and statistics, the upload schedule, how much disk space is allocated for each type of reporting, and the target upload URL (which can be a preconfigured CGI script).

[0215] Preferably, the default content parameter value is “all” for cache priority, active update and invalidation, and there is no default for other directives.

[0216] Some directives carry additional directive-specific parameters required for their effective and successful application. For example, pre-positioning directive parameters preferably include one or more URLs or URL patterns, directory depth (how many levels of sub-directories to explore and pre-position), and/or a set of discrete time values for scheduled pre-positioning. Optionally, the VFN transmitter crawler (described below) automatically generates a list of URLs for a specified root URL by traversing the tree of the root URL. In addition to directly specifying the list of resources, the parameters of the pre-positioning directive can alternatively specify a URL containing a list of resources to be pre-positioned. Parameters of pre-positioning directives may also include constraints, such as limitations on the overall bandwidth allowed at a given time or the maximum number of concurrent connections allowed to be opened when attempting to fulfill the directive.

[0217] Pre-positioning directives preferably include two additional parameters: archive and authorize. Resources tagged with the archive parameter are archived by the VFN transmitter's archiver, as described below. The authorize parameter applies only to HTTP resources. When such resources are tagged with this parameter, the VFN receiver requests authorization from the VFN transmitter before allowing user clients to access such resources.

[0218] String patterns may be used for content, header and directive-specific parameters. Supported string-pattern-matching operators preferably include is, is-not, contains, does-not-contain, starts-with and ends-with.
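By way of illustration only, the six string-pattern-matching operators listed above can be sketched as a small lookup table. The function name `matches` and the table `OPERATORS` are hypothetical and do not appear in the VFN system; case-sensitive comparison is likewise an assumption.

```python
# Hypothetical operator table; the keys follow the operator names listed in the text.
# Case-sensitive matching is an assumption made for this sketch.
OPERATORS = {
    "is": lambda value, pattern: value == pattern,
    "is-not": lambda value, pattern: value != pattern,
    "contains": lambda value, pattern: pattern in value,
    "does-not-contain": lambda value, pattern: pattern not in value,
    "starts-with": lambda value, pattern: value.startswith(pattern),
    "ends-with": lambda value, pattern: value.endswith(pattern),
}


def matches(value: str, operator: str, pattern: str) -> bool:
    """Return True if `value` satisfies `operator` applied to `pattern`."""
    try:
        return OPERATORS[operator](value, pattern)
    except KeyError:
        raise ValueError(f"unsupported operator: {operator}") from None
```

A directive parameter such as a content pattern could then be tested with, for example, `matches(url, "starts-with", "http://intranet.example/docs/")`.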

Transmitter and Receiver Application Layers

[0219] VFN System Metadata

[0220] VFN system 20 creates, stores, and maintains metadata (“VFN metadata”) for all resources registered with the system. (VFN metadata is distinct from file metadata, as explained below with reference to FIG. 7.) VFN metadata preferably includes:

[0221] the identity of the resource owner, which is a VFN transmitter;

[0222] the identity of at least one VFN gateway (not necessarily the resource owner) that holds the current version of the resource;

[0223] the resource's local state (fully or partially available, local version held, freshness of local version, local usage statistics);

[0224] computed signatures, which are used as file version identifiers. For example, a computed signature may be calculated from a resource's i-node number, creation and last modification time, or by applying a cryptographic hash to the content of the resource;

[0225] access lists, as described below;

[0226] locking status, as described below;

[0227] usage statistics, as described above;

[0228] version and change records between versions; and

[0229] associated volume, if any, as described below.
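The computed signatures mentioned in the list above can be sketched as follows. This is a minimal illustration only: the choice of SHA-256, the function names, and the way the attributes are combined are assumptions, since the text names neither a particular hash nor a particular encoding.

```python
import hashlib


def content_signature(content: bytes) -> str:
    """Version identifier obtained by hashing the resource content.

    SHA-256 is an assumption; the text only calls for a cryptographic hash.
    """
    return hashlib.sha256(content).hexdigest()


def attribute_signature(inode: int, created: float, modified: float) -> str:
    """Version identifier derived from i-node number, creation time, and
    last modification time (hypothetical encoding of the three attributes)."""
    packed = f"{inode}:{created}:{modified}".encode("ascii")
    return hashlib.sha256(packed).hexdigest()
```

Two replicas whose signatures are equal can be treated as the same version without comparing their contents; the attribute-based form avoids reading the file at all, at the cost of missing changes that leave the attributes untouched.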

[0230] VFN metadata is stored hierarchically in an upper level resource directory at its owner VFN transmitter, which is responsible for maintaining the most recent VFN metadata for the resource. Any changes made to a resource by a holder other than the owner must be reported to the owner. The hierarchical structure of the VFN metadata resource directories allows each VFN gateway to navigate the directory structure, fetch VFN metadata, and assemble each resource from its owner or owners.

[0231] By default, the owner of a file or directory resource is the VFN transmitter where the resource is first registered with or created in the VFN system. The owner learns of the existence of a resource by scanning the resources of a local file server using a crawler, as described below, or by discovering a new resource in a local file system following a client request for a local directory. Additionally, the owner learns of a new file when the creation of the file by a user client is intercepted by a file server in interception layer 54.

[0232] Optionally, the owner and/or holder may be changed manually by a VFN administrator or changed automatically based on directives. For example, changing the owner may improve efficiency when a resource is modified extensively at a gateway other than the owner gateway, or when policies preclude certain gateways from serving as owners and/or holders because of reliability concerns. Optionally, the new owner is a VFN receiver, which is granted exclusive access to the resource. Such a change of owner becomes effective only when the parent directory, which contains the resource, approves this change by recording the new owner and updating the VFN metadata. Similarly, policies can stipulate restrictions on which gateways can be owners and/or holders, including, for example, a restriction that an owner must be the holder of its resources.

[0233] Preferably, before a VFN gateway that is not authorized to be a holder can change a resource, the change must be replicated and authorized by the resource owner. If an unauthorized local change is made by such a gateway, the modified resource is preferably stored in a local overflow buffer, and a conflict is reported to the management subsystem. Preferably, such conflicts are resolved manually (for example, merged by a user), or automatically by resource-type-specific procedures designed to handle specific conflicts.

[0234] Each resource is identified within the VFN by a unique VFN resource handle. The handle includes the identity of the resource owner, the directory path that leads to the resource, and a unique identifier within its directory. Preferably, the VFN system-managed name space is consistent with the native name space. Alternatively, the VFN system may provide a global name space.

[0235] Access lists are used to determine the clients of VFN system 20 that are entitled to access a given resource. Such access lists can be defined using native network file system hosts and user names, or by a VFN administrator using VFN access groups. These VFN access groups are global group identities that are mapped to local identities in each VFN gateway. Such access lists may be useful when the VFN system is deployed as an extranet across multiple organizations or across more than one WAN within an organization. Preferably, when VFN access lists differ from their corresponding native file system access lists, access permission is mapped from the native file system access lists to the VFN access lists, most preferably using the user names or IDs of the native file system. Access permissions are checked as appropriate for the protocol, on either the VFN transmitter or VFN receiver, prior to or after translation. Changes in permission are reflected across the security domains.

[0236] Each resource can be identified as part of a volume, which is a set of resources. Volumes can be defined using logical expressions, including inclusion and exclusion filters and operators, applied to directory, file name, and attribute information. Directives may be applied to individual resources, recursive directories, and/or to volumes.

[0237] In addition to VFN metadata, each VFN gateway maintains a record of up-to-date files and file blocks locally available in its cache, together with the original version and timestamp attributes of each file. This record is referred to hereinafter as the “locally available resources,” or “LAR”.

[0238] Preferably, LAR information is replicated between neighboring VFN gateways. This replication occurs periodically, and, in certain cases, on demand. Information regarding small locally available resources (for example, resources with sizes less than 256 kilobytes) is preferably not replicated, in order to maximize efficiency. The LAR information includes a small number of attributes that uniquely identify the LAR resource with respect to its VFN metadata.

[0239] By replicating LAR information, the VFN system maintains at each VFN gateway information regarding the availability of resources at non-owner and non-holder VFN gateways. This information can be used by VFN gateways to access resources over alternate routes or in parallel from multiple VFN gateways, as described below. Because LAR information is typically replicated only for large resources, and the LAR information includes only a small number of attributes, the size of LAR files generally remains small, even in large VFN systems. This small size facilitates a thorough replication of LAR information using minimal WAN bandwidth.

[0240] Repository Plug-in API

[0241] The repository plug-in API is a layer in transmitter application layer 42 that provides an abstraction of the access mechanism to multiple repositories, such as NFS, CIFS, HTTP, and FTP. The plug-in hides the details of the implementations of these various repositories from the transmitter application layer. It also provides transmitter application layer 42 with a consistent repository interface that handles functions such as name traversal, locking, read, write, and listing.

[0242] File Server Operations

[0243] Each of the file servers in interception layer 54 (FIG. 3) supports the file server operations provided by the corresponding native file server 25. Preferably, the interception layer file servers support all of the corresponding file server operations, including block-level reading and writing. This support is desirable to enable VFN receiver 48 to transparently act as a file server for registered remote resources. When a request for an operation is received by a file server in interception layer 54 from a user client 28, VFN receiver 48 parses the request and determines whether the resource is present in its local cache 76. If so, the file server in the interception layer serves the requested resource directly to the client.

[0244] If the resource is absent from the cache, VFN receiver 48 passes the request via WAN gateway 24 to the appropriate VFN transmitter 52, preferably using an internal VFN API that is common to all supported network file systems, including NFS and CIFS. The clients in repository connector layer 50 in VFN transmitter 52 issue requests to the native file servers 25, and transfer the results, over the WAN, to the VFN receiver, which passes the response back to user client 28.

[0245] For network file systems that support mounting (such as NFS), the VFN system supports natural integration of file servers in interception layer 54 with users' local file systems through mount points (local file system locations on users' systems where mounted file system directories are attached). Preferably, multiple mount points are supported, and there can be multiple client mounts on any sub-directory of any mount. These mount points are associated by the VFN receiver's local configuration file with paths in the directory structure of the VFN transmitter. The VFN receiver preferably enforces configuration settings specifying which mounts are accessible to each VFN receiver. Typically, mounting does not require credentials because it piggybacks on the first user request for a resource on a file server. Alternatively, for VFN transmitter-initiated activity, the VFN transmitter possesses credentials that allow access to file server shares and resources, thereby enabling “context-free” (with respect to user credentials) access.

[0246] The VFN system preferably supports global file system operations such as querying free size and quotas. Either the correct origin site values are reflected, or synthetic values are generated where appropriate.

[0247] FIG. 6 is a flow chart that schematically illustrates a method for requesting an operation on a resource, such as a file, in accordance with a preferred embodiment of the present invention. The method illustrated in FIG. 6 is general and does not include application of consistency protocols, which are described below with reference to FIGS. 8 and 9. This method is used whenever a client 28 requests an operation (such as open, read, write, or close) on a resource R registered with the VFN system and held by a remote file server 25, at a resource request step 100. The resource request is intercepted by interception layer 54 of VFN receiver 48 of the VFN gateway (GW1) that resides on the client's LAN, at an interception step 102. The VFN receiver checks whether a valid replica of resource R is stored in cache 76 of the VFN receiver of GW1, at a GW1 cache check step 104. If R is present in the cache, the VFN receiver permits the resource request to proceed, at a reply step 118.

[0248] On the other hand, if a valid replica of resource R is not stored in the cache of the VFN receiver of GW1, the VFN receiver forwards the request for a replica of resource R, over WAN 29, to VFN transmitter 52 of the remote VFN gateway (GW2) that is the owner of resource R, at a remote request step 106. The remote VFN transmitter checks whether a valid replica of resource R is stored in the cache of GW2, at a GW2 cache check step 108. If so, the VFN transmitter permits the resource request to proceed, at a remote resource transfer step 114. On the other hand, if a replica is not available in GW2, the appropriate file system client in repository connector layer 50 in the remote VFN transmitter fetches resource R from the local file server 25 holding resource R, at a file server fetch step 110. (This is the native file server that resides on the same LAN as GW2.) The VFN transmitter stores resource R in its cache, at a GW2 cache storage step 112.

[0249] Whether resource R was available in the cache of GW2 (step 108) or had to be fetched from the local file server (step 110), the remote VFN transmitter in GW2 transfers resource R to the VFN receiver in GW1, at step 114. VFN gateway GW1 stores resource R in its VFN receiver cache 76, in a GW1 cache storage step 116. The local VFN receiver then replies to the original client request with resource R, at step 118.
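The flow of steps 100 through 118 can be sketched, under simplifying assumptions, as a two-level cache lookup. The `Gateway` class and its methods are hypothetical; the WAN transfer is modeled as a direct method call, the native file server as a dictionary, and replica validity checking (handled by the consistency protocols) is omitted.

```python
class Gateway:
    """Toy VFN gateway (hypothetical). `file_server` stands in for the native
    file server on the same LAN; `cache` stands in for the gateway cache."""

    def __init__(self, name, file_server=None):
        self.name = name
        self.cache = {}
        self.file_server = file_server or {}

    def transmit(self, resource_id):
        """Transmitter side (GW2): steps 108 through 114."""
        if resource_id not in self.cache:                            # step 108: cache check
            self.cache[resource_id] = self.file_server[resource_id]  # steps 110, 112
        return self.cache[resource_id]                               # step 114: transfer

    def request(self, resource_id, owner):
        """Receiver side (GW1): steps 102 through 118."""
        if resource_id not in self.cache:                            # step 104: cache check
            self.cache[resource_id] = owner.transmit(resource_id)    # steps 106, 116
        return self.cache[resource_id]                               # step 118: reply
```

For example, with `gw2 = Gateway("GW2", file_server={"/docs/a.txt": b"hello"})` and `gw1 = Gateway("GW1")`, a first call to `gw1.request("/docs/a.txt", owner=gw2)` populates both caches, and a repeated call is answered from the GW1 cache without contacting GW2.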

[0250] Alternatively, resource requests can be served by the holder of the resource, as recorded in the owner-maintained VFN metadata, rather than from the owner. Preferably, before making such an access, the VFN metadata is checked for recent modification or for a possible lock. Alternatively, it is sometimes more efficient to download a file from a VFN gateway other than the holder if the alternate gateway holds the correct file version and is enabled at the time of the download. This may be the case, for example, if the connection with the alternate gateway has higher bandwidth or lower latency. The presence of a file on an alternate gateway is preferably determined by checking the LAR at the local gateway and the alternate gateway. Files too small to be recorded in the LARs are always downloaded from their holders. Preferably, a request for resource VFN metadata is always served from the resource owner in order to guarantee full consistency.

[0251] Caching

[0252] Caching is preferably implemented centrally for each LAN by VFN receiver 48 on the LAN. Preferably, caching is performed on file blocks as well as entire files. Caching criteria are preferably parameterized by resource-specific filters, which include:

[0253] Size range, which specifies a resource minimum and/or maximum size for caching. (Typically, the default is no size range limitation.)

[0254] Authorized (HTTP-only), which specifies that the filter is parameterized with the HTTP authorization of resources. Allowed values are authorized only, unauthorized only, and ignore (which is preferably the default).

[0255] Priority, which affects the cache replacement policy that determines which resources are replaced when the cache is full and a new resource is requested. Priority caching can be specified for fully-qualified URLs or for content patterns.

[0256] The cacheability and maximum resource cache age (max_age parameter) can preferably be controlled by use of appropriate directives. Greater control over a resource's time-to-live in the cache can be achieved by setting an appropriate max_age value for the resource.

[0257] In addition to and separate from support for various consistency guarantees, as described below, the VFN system preferably supports two cache priority levels: “sticky” and “normal”. “Sticky” priority provides pseudo-mirroring of resources in the VFN receiver cache: so long as the priority is not changed, and so long as there is sufficient disk space to hold all resources having this priority, resources enjoying sticky priority are not removed from the cache. If the VFN receiver is prevented from adding a new sticky resource to its cache, an error log entry is generated. In contrast to standard mirroring, the resource copying may be lazily driven by a client's request. For HTTP resources, sticky priority may be (but preferably is not) used to cache resources that may not otherwise be cacheable per the HTTP specification.

[0258] “Normal” priority is used to provide standard popularity-based caching behavior, using cache removal policies that can be selected when the VFN system is configured.

[0259] The VFN receiver typically supports three alternative cache removal policies:

[0260] LRU (Least Recently Used), which is based on removing the least recently used resources from the cache to free up space in the cache for new requested resources.

[0261] LFU (Least Frequently Used), which is based on removing the least frequently used (i.e., the least popular) resources from the cache to free up space for new requested resources. When LFU is used, preferably an LFU-Dynamic-Aging variant is used, in which an age factor is taken into account in addition to frequency of usage.

[0262] GDS (Greedy Dual Size), in which size, effort to fetch, and popularity are taken into account.
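As an illustration of the third policy, the following sketch uses the commonly published Greedy-Dual-Size priority with a frequency term, priority = inflation + hits × cost / size. The text does not give a formula, so this priority function, the class name, and the unit of `cost` (the "effort to fetch") are all assumptions.

```python
class GreedyDualSizeCache:
    """Sketch of a Greedy-Dual-Size removal policy with a popularity term.

    Assumed priority: inflation + hits * cost / size. The entry with the
    lowest priority is removed first, and the inflation value is raised to the
    removed entry's priority so that long-idle entries age out over time.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.inflation = 0.0
        self.entries = {}  # key -> [priority, size, cost, hits]

    def _priority(self, size, cost, hits):
        return self.inflation + hits * cost / size

    def access(self, key, size, cost):
        """Record a request for `key`; return True on a cache hit."""
        if key in self.entries:
            entry = self.entries[key]
            entry[3] += 1
            entry[0] = self._priority(entry[1], entry[2], entry[3])
            return True
        if size > self.capacity:
            return False  # too large to cache at all
        while self.used + size > self.capacity:
            victim = min(self.entries, key=lambda k: self.entries[k][0])
            self.inflation = self.entries[victim][0]
            self.used -= self.entries[victim][1]
            del self.entries[victim]
        self.entries[key] = [self._priority(size, cost, 1), size, cost, 1]
        self.used += size
        return False
```

Under this priority, large resources that are cheap to fetch and rarely requested are removed before small, expensive, or popular ones.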

[0263] Preferably, the VFN receiver actively refreshes cache resources, based on the setting of the active refresh directive described above. This directive specifies when a VFN receiver should actively validate a cached resource, rather than only passively refreshing a cached resource in response to a client request. The active refresh may be used in order to increase or decrease the consistency of the cached data. It is applied only to resources that are already in the cache. Active refresh directives are preferably parameterized by content (fully qualified or pattern), time, and resource filters. Active refresh can operate on both cached resources and exported resources, as described below.

[0264] Based on the setting of the active invalidate directive described above, the VFN receiver can actively invalidate (expire) a resource in its cache when the resource is no longer valid or available. Active invalidate directives are preferably parameterized by content (fully qualified or pattern), time, and resource filters. The service may be used to delete resources from the cache or to ensure that a subsequent access will revalidate the resource with the VFN transmitter, without physically removing the resource replica from the cache. For exported resources, the invalidation preferably always physically removes the replica from the exported area.

[0265] The VFN system preferably supports negative caching. When a VFN gateway on another LAN responds that a requested resource is not found, this negative response is cached by the requesting VFN receiver for a certain amount of time, so that the same request will not be repeated unnecessarily. Negative caching of this sort generally reduces bandwidth consumption and reduces resource request response time.

[0266] Performance of the VFN system additionally benefits from any local caching facilities provided by the network file system between client 28 and VFN receiver 48.
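Negative caching of this kind can be sketched as a table of expiry times. The class name, the method names, and the injectable `clock` parameter are hypothetical, and the retention period is left to the caller because the text specifies only "a certain amount of time".

```python
import time


class NegativeCache:
    """Remembers "not found" responses for a limited time (hypothetical sketch)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable so the behavior can be tested
        self._expires = {}          # resource id -> expiry time

    def record_not_found(self, resource_id):
        """Cache a negative response received from a remote gateway."""
        self._expires[resource_id] = self.clock() + self.ttl

    def is_known_missing(self, resource_id):
        """Return True while a cached negative response is still fresh."""
        expiry = self._expires.get(resource_id)
        if expiry is None:
            return False
        if self.clock() >= expiry:
            del self._expires[resource_id]  # entry expired; ask the remote gateway again
            return False
        return True
```

A receiver would consult `is_known_missing` before forwarding a request over the WAN and answer "not found" locally when it returns True.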

[0267] HTTP Caching

[0268] Caching of HTTP resources is preferably integrated into the VFN system's general caching functionality, as described above. The approach the VFN system uses for serving HTTP resources is similar to the approach used for serving file system resources. HTTP server 60 serves resources transferred from a VFN transmitter 52 and cached in cache 76 of VFN receiver 48. The VFN receiver accepts requests for standard HTTP methods, forwards these requests to the VFN transmitter when appropriate, and sends the response to the requests to the user client.

[0269] In addition, certain aspects of caching are unique to HTTP resources. Aspects of Web content caching that are pertinent to this feature of the present invention are described in U.S. patent application Ser. No. 09/785,977, whose disclosure is incorporated herein by reference. In this context, HTTP server 60 may serve cached HTTP and HTTPS resources that VFN receiver 48 fetches directly from servers external to the VFN system, without these resources passing through a VFN transmitter. Such external resources may be located on the Internet, the enterprise WAN, or an extranet. To support this direct VFN receiver caching of HTTP content, the VFN receiver acts as a caching HTTP proxy for domains explicitly directed to it. Such resources are preferably identified by a crawler that traverses their origin Web sites.

[0270] Setting the appropriate cacheability value (force caching, force non-caching or default) allows fine-tuning of the normal popularity-based HTTP caching behavior in order to support partial caching of dynamic content and to allow superseding the caching of lower-priority resources. Standard HTTP requests and responses may carry headers that specify that they should not be cached. Additionally, standard HTTP resources with a query string (the format of which is http://<path>?<query>) are not cacheable by default. Setting cacheability to “force” overrides this default HTTP behavior by disregarding the query parameters. Setting policy to “none” may prevent popular resources from competing with less popular resources that are of higher importance to the VFN operator.

[0271] The VFN system preferably supports inline modification of URLs in HTML pages to enable redirection of Web content, taking into account multiple origin Web sites. This approach generally minimizes the amount of required manual configuration. Preferably, cache 76 caches only successful responses to HTTP GET requests. All other responses are relayed unmodified to the requesting client. The cache preferably employs common resource aging and expiration heuristics to improve resource consistency. Preferably, the VFN receiver supports partial HTTP requests and responses.

[0272] Preferably, the VFN system supports simple caching of dynamic content. The desired URLs (up to the “?” character) are selected by the VFN administrator, and the VFN receiver caches the content based on the entire string, including everything after the question mark.

[0273] Preferably, the VFN receiver can be configured to support caching of authorized (also called authenticated or private) content. Authorized caching is supported for content accessed through a VFN transmitter, and for content retrieved directly by a VFN receiver from an origin Web site. To implement authorized content caching, the VFN receiver caches the resource's data, but, before it grants the client access to the data, the VFN receiver sends an authorization request to the proper VFN transmitter, which is responsible for granting access to the content. Content may be tagged as authorized either following an authorized request to a resource not previously cached or because the VFN system has pre-positioned the content. In either case, because content may be mistakenly marked as authorized (for example, when a client browser issued a request with a superfluous Authorization header), the VFN receiver may clear the resource's authorization tag following a successful, non-authorized, request for the resource. This configuration is preferably applied to a VFN receiver's cache as a whole rather than on a per-resource basis, and is preferably enabled or disabled continuously during the VFN receiver's operation (unless configuration changes are made during operation). Authorized content can be cached, if enabled, or negatively-cached, if desirable.

[0274] Preferably, the VFN receiver cache complies with HTTP version 1.1, as specified by Request for Comments (RFC) 2616 of the Internet Engineering Task Force (IETF). HTTP 1.1 caching directives (according to RFC 2616, Sections 13 and 14) include the following:

[0275] Cache correctness;

[0276] Adherence to pragma: no-cache header values;

[0277] Partial support of the cache-control header;

[0278] Server expiration via the expires header; and

[0279] Support for resource validation headers: last-modified, date, if-modified-since, and if-none-match.

[0280] When serving HTTP requests, the VFN receiver preferably maintains a finite state machine (FSM) for handling each request. The VFN receiver applies all matching directives in the proper phases in the FSM traversal.

[0281] Preferably, when a user client experiences delay in receiving a large Web resource, the VFN receiver generates a Web page with estimated availability time. Notification upon resource availability may also be provided by e-mail, pager, or other remote notification devices.

[0282] Edge Customization

[0283] Preferably, VFN receivers support URL translation, which enables a VFN administrator to map a request directed to a source URL to a request to some translation target URL. This service eliminates the roundtrip from the VFN receiver to the VFN transmitter and back. Preferably, URL translation can be customized by VFN receiver and by time, such as time of day or week.

[0284] URL translation is parameterized by the source (one or more source URLs or patterns), time, HTTP headers, and translation target. The translation target may be a single URL, allowing the mapping of multiple URLs to a single translation target, or a URL pattern, allowing the redirection of part of the URL namespace identified by a prefix pattern to another prefix. Pattern-based translation replaces the source prefix with the destination prefix. If the source prefix is not present in the URL, translation does not occur. Therefore, the source URL pattern should use the “starts-with” or “is” operators.

[0285] If multiple URL translations are defined for a source URL, the following algorithm is preferably applied in order to ensure both consistency and multiple partial translations:

[0286] If any of the translations specifies a single (i.e., not pattern) destination, that translation is preferred over all others.

[0287] Otherwise, matching translations are applied in order (from longest to shortest source prefix, as measured by full path elements specified). Following each translation, the next translation in line is matched against the target URL and discarded if no longer valid. If two or more translations with the same path length are defined, the later translation is preferred over the earlier ones.

[0288] In a preferred embodiment of the present invention, the VFN receiver supports request header modification, which appends HTTP headers to requests en-route from the VFN receiver to the VFN transmitter. The service can be parameterized by the source (one or more source URLs or patterns), time, HTTP headers, and the list of headers and values to append. Appended headers are formatted as name/value pairs. The name is defined in the directive, whereas the value may be a fixed string specified in the directive or a system variable (which will be replaced by the current value of the variable in the VFN receiver). System variables are defined by the manager console. They can be assigned separately for each VFN gateway, and their values may be null.
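A minimal sketch of the header-appending step follows. The `$NAME` notation for referring to a system variable and the function name `append_headers` are hypothetical, as is the choice to emit an empty value when a variable is null or undefined; the text does not specify either.

```python
def append_headers(request_headers, directive_headers, system_variables):
    """Return the request headers with the directive's name/value pairs appended.

    A directive value beginning with "$" (hypothetical syntax) names a system
    variable and is replaced by that variable's current value on this gateway.
    A null or undefined variable is rendered as an empty string (assumption).
    """
    result = list(request_headers)
    for name, value in directive_headers:
        if value.startswith("$"):
            resolved = system_variables.get(value[1:])
            value = "" if resolved is None else resolved
        result.append((name, value))
    return result
```

For instance, a directive pair `("X-Site", "$SITE")` applied on a gateway whose `SITE` variable is `"tlv"` appends the header `X-Site: tlv` to the outgoing request.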

[0289] Pre-Positioning

[0290] In addition to on-demand retrieval and caching, remote resources are efficiently and transparently made available to clients by file replication (“pre-positioning”). Pre-positioning, like caching, is implemented centrally for each LAN by its VFN receiver 48, under the direction of its control agent 36.

[0291] Management subsystem 33 configures distribution-related policies and issues distribution-related directives, as described above with reference to FIG. 5. Additionally, control agent 36 automatically and adaptively generates directives that, among other things, optimize the determination of which remote resources to replicate at each VFN receiver and provide various levels of active synchronization. Based on these policies and directives, selected resources are pre-positioned prior to a client request.

[0292] Such automatically-generated directives are preferably executed using algorithms that determine which resources to pre-position and when to pre-position. Preferably there are two types of pre-positioning algorithms:

[0293] Selective pre-positioning algorithms, which select the subset of remotely-available resources to be pre-positioned based on a demand-to-modification rate ratio. Resources with a higher ratio of expected usage at the destination VFN gateway to expected modification rate at the source are more likely to be pre-loaded. This ratio is preferably updated using online measurements and an exponential window average mechanism. Pre-positioning priority and frequency are configurable to meet the constraints of available bandwidth.

[0294] Adaptive scheduling algorithms, which determine the preferable time and transfer rates to perform pre-positioning based on an available bandwidth-to-demand-to-modification rate ratio. Available bandwidth is based on historical traffic measurements indicating low-traffic and low-latency periods. These measurements preferably include average delivery rate, number of concurrent connections required to achieve maximal rate, and connection latency. The values are preferably updated using online measurements and an exponential window averaging mechanism.
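The selective algorithm's ratio and its exponential window average can be sketched as below. The smoothing factor `alpha`, the class and function names, and the small `floor` that guards against division by zero are assumptions; the text does not give numeric parameters.

```python
class RateEstimate:
    """Exponentially weighted average of online rate measurements (sketch)."""

    def __init__(self, alpha=0.25, initial=0.0):
        self.alpha = alpha      # weight of the newest sample (assumed value)
        self.value = initial

    def update(self, sample):
        """Blend a new measurement into the running estimate and return it."""
        self.value = self.alpha * sample + (1.0 - self.alpha) * self.value
        return self.value


def preposition_score(demand_rate, modification_rate, floor=1e-9):
    """Demand-to-modification rate ratio; higher means a better candidate."""
    return demand_rate / max(modification_rate, floor)
```

A control agent could keep one `RateEstimate` for expected usage at the destination gateway and one for the modification rate at the source, and pre-position the resources whose `preposition_score` is highest within the available bandwidth.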

[0295] Virtual Directory

[0296] FIG. 7 is a schematic illustration of a virtual directory 80, in accordance with a preferred embodiment of the present invention. Each VFN receiver 48 maintains a virtual directory of files held by remote file servers on other LANs. All registered directory trees from the remote servers are pre-positioned in the virtual directory. The directory information is preferably kept up-to-date, irrespective of file requests by its local clients, by tracking and notification of changes by the VFN transmitter or by active scanning and updating of changes by the VFN receiver. When the VFN receiver intercepts a request for file directory information or file metadata from one of local clients 28, the VFN receiver looks up the information in its local virtual directory. The VFN receiver then returns the requested information directly to the client, avoiding the delay that would otherwise be involved in requesting and receiving the information from remote file server 25 across WAN 29.

[0297] Virtual directory 80 preferably includes file metadata, including all file attributes that might be requested by a client application, such as size, modification time, creation time, and file ownership. If necessary (as in the case of NFS, for example), VFN transmitter 52 extracts this file metadata from within the files stored on the origin file server, wherein the file metadata is ordinarily kept.

[0298] Local storage of this file metadata in the virtual directory has several advantages. Many file system operations require attributes of numerous files without requiring the content of those files. The virtual directory precludes the need to transfer and store these unnecessary complete files. By use of the local virtual directory, the VFN receiver provides the client with fast response time to metadata-only operations, such as browsing the file system and property checking, as well as for performing permission and validation checks against these attributes. For example, the use of the local virtual directory enables receiver application layer 40 of VFN receiver 48 to efficiently provide quick responses to common file system operations such as getting file attributes (getattr in NFS, for example). The virtual directory is also used internally by the VFN system, for example, for making consistency checks, which can be done against metadata.

[0299] Virtual directory 80 stores an availability attribute for each resource in the virtual directory. These availability attributes facilitate responses to requests for file operations that require a file's contents, and not only its metadata. There are preferably three levels of availability:

[0300] cached or pre-positioned in the VFN receiver's cache 76, shown as cached resources 82;

[0301] pre-positioned in the VFN transmitter's cache 77, shown as transmitter cached resources 84; and

[0302] remotely available, but not cached, shown as remote resources 86.

[0303] When responding to an intercepted file operation request on a file in virtual directory 80, the VFN receiver uses this availability information to determine whether to serve the file from cache 76 or to request the file from its remote origin file server.

[0304] Consistency

[0305] As described above, the VFN system uses caching to improve performance. Caching creates multiple replicas of a resource. When any of these replicas are modified, they may become inconsistent with one another (although concurrent access generally occurs relatively infrequently). The VFN consistency protocol provides guarantees with respect to the freshness of replicas, and provides mechanisms for propagating modifications to replicas. There are three consistency paths within the VFN system:

[0306] between client 28 and VFN receiver 48. Consistency along this path is handled by the cache-consistency protocol of the native network file system;

[0307] between VFN receiver 48 and VFN transmitter 52. Consistency along this path is handled by the VFN system; and

[0308] between VFN transmitter 52 and file server 25. The VFN system preferably provides consistency along this path, as well. This consistency is desirable because users outside of the VFN system can use and modify resources held by file server 25 concurrently with VFN system access to the same resources. Elements of the native network file system consistency protocol are preferably used between repository connector 50 and external file servers, depending upon the capabilities of the origin file server, such as change notification. Additionally, a VFN file agent is preferably used, as described below.

[0309] Preferably, the VFN system supports three levels of consistency, which can be configured, for example, for individual files, file types, origin servers, or a combination of these parameters:

[0310] Strict consistency, the highest level of consistency, is preferably implemented using a client-driven approach, whereby the VFN receiver queries the VFN transmitter on each access to a resource in order to determine if the cached resource is still valid.

[0311] High consistency, which is a middle level of consistency, is preferably implemented using a server-driven approach using leases, as described below.

[0312] Relaxed consistency, a lower level of consistency, is preferably implemented using a client-driven approach, whereby the VFN receiver periodically queries the VFN transmitter in order to determine whether cached resources are valid, preferably using the algorithms described below.

[0313] In relaxed cache consistency, if a maximum age parameter (max_age) has been defined for a resource by the VFN management subsystem, this value is used to determine when to validate the resource. Otherwise, if the resource is an HTTP resource, and it includes the HTTP header “Expires” or “Cache-Control: max-age,” the values in these headers are used to determine when to validate the resource. For non-HTTP resources, if the last modification time of the resource is known (because it was passed internally in the VFN system through a “last modified header” parameter), the maximum age is calculated as follows:

max_age = 0.2 * (current_date − last_modified)

[0314] Otherwise, when the resource has no last modification timestamp, the maximum age of the resource is set to a default (default_age), which is specified in the local configuration file. (Typically, this default is 15 minutes.) If no max_age parameter has been defined and the calculated age is greater than a maximum default boundary (max_resource_age) (which is specified in the local configuration file), the max_age of the resource is decreased to max_resource_age. The default for max_resource_age preferably is one day.
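The order of precedence described for relaxed consistency may be illustrated by the following minimal sketch. The function name relaxed_max_age and its parameters are hypothetical; times are assumed to be expressed in seconds, and the HTTP header value is assumed to have been reduced to a single age by the caller.

```python
from typing import Optional

DEFAULT_AGE = 15 * 60             # default_age: typically 15 minutes
MAX_RESOURCE_AGE = 24 * 60 * 60   # max_resource_age: preferably one day


def relaxed_max_age(now: float,
                    configured_max_age: Optional[float] = None,
                    http_age: Optional[float] = None,
                    last_modified: Optional[float] = None,
                    default_age: float = DEFAULT_AGE,
                    max_resource_age: float = MAX_RESOURCE_AGE) -> float:
    """Return the age (in seconds) after which a replica is revalidated."""
    # 1. A max_age defined by the management subsystem wins outright
    #    and, per the text, is not clamped.
    if configured_max_age is not None:
        return configured_max_age
    # 2. HTTP resources: use the age derived from the caching headers.
    if http_age is not None:
        age = http_age
    # 3. Known modification time: 20% of the time since last change.
    elif last_modified is not None:
        age = 0.2 * (now - last_modified)
    # 4. Nothing known: fall back to the configured default.
    else:
        age = default_age
    # Without a configured max_age, the result is capped.
    return min(age, max_resource_age)
```

Under this heuristic a resource last modified 1000 seconds ago is revalidated after roughly 200 seconds, while a resource untouched for months is still revalidated at least once a day.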

[0315] In order to implement high consistency between VFN receivers and VFN transmitters, consistency is preferably managed centrally for each resource by the VFN transmitter that owns the resource. Alternatively, the VFN system may use a distributed approach to consistency management, such as a token passing scheme.

[0316] Pursuant to the preferred central management approach, lease manager 44 in VFN transmitter 52 and lease client 38 in VFN receiver 48 communicate with one another and together implement leasing. Preferably, the VFN system uses a server-driven lease-based consistency protocol. A lease provides the VFN receiver with permission to perform a specified operation (for example, read or write) on a specified resource (for example, a file or directory) for a specified duration (timeout period). While the lease is valid, the VFN receiver may perform the specified operation without contacting its peer VFN transmitter (with the exception of write-back of changes, which is described below). Leases are preferably granted on a per-file or per-directory basis rather than on a per-file-block basis, even though file block transfers between VFN gateways are supported.

[0317] Advantageously, a lease held by a VFN receiver's lease client serves all clients 28 of the VFN receiver. As a result, the validity of the lease is not affected as long as all operations, including operations by multiple clients, are performed against the local VFN receiver. A lease must be revoked, as described below, only when a client of another VFN receiver issues a conflicting request for the leased resource. The approach of the VFN system to leasing generally provides data consistency with bounded synchronization guarantees so that substantially no stale data is served.

[0318] Preferably the lease data structure is as follows:

[0319] {object id, object version, lease type, grant time, duration, epoch}

[0320] wherein object id is a unique identifier for each resource, object version indicates the version of the resource, lease type is the specified operation for which the lease has been granted, grant time is the time the lease was granted, duration is the duration of the lease, and epoch is an identification of a specific VFN transmitter instance. Epoch may be used to allow leases to be revoked and/or reclaimed after a server restart or network disconnection, by allowing the server and client to determine which “instance” of the VFN transmitter granted the lease.
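The lease record above may be modeled, for illustration, as a small immutable structure with a validity test. The class name Lease, the field types, and the is_valid method are assumptions; in particular, treating a lease from a different epoch as not valid until it is reclaimed is one possible reading of the epoch mechanism, not a stated requirement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lease:
    object_id: str        # unique identifier of the resource
    object_version: int   # version of the resource the lease covers
    lease_type: str       # operation granted, e.g. "read" or "write"
    grant_time: float     # time the lease was granted (seconds)
    duration: float       # timeout period (seconds)
    epoch: str            # identifies the granting transmitter instance

    def is_valid(self, now: float, current_epoch: str) -> bool:
        """True while the lease is unexpired and its granting instance is
        still the one being talked to (assumed policy)."""
        same_instance = self.epoch == current_epoch
        unexpired = now < self.grant_time + self.duration
        return same_instance and unexpired
```

A zero-length lease, which the text describes as one way of denying a request, is never valid under this test because its expiry equals its grant time.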

[0321] Lease manager 44 tracks lease holders using the following data structure for each lease issued:

[0322] {object id, VFN ids of lease holders, usage type}

[0323] wherein the VFN ids are unique identifiers of lease clients 38 that hold the leases, and usage type is the type of usage the lease permits (read-only, write). Preferably the usage type is used to optimize the lease duration for typical use scenarios by recording information about past usage.

[0324] Lease client 38 tracks the leases it holds using the following data structure:

[0325] {lease id, client modification log for update propagation}

[0326] wherein lease id is a unique identifier for each lease, and the log keeps track of modifications made by the client for use during propagation of updates to the origin VFN transmitter, as described below.
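The two tracking records may be sketched as follows. The class names HolderRecord and HeldLease, the representation of a modification as an (offset, data) pair, and the drain method are all hypothetical; the text specifies only the fields of each record.

```python
from dataclasses import dataclass, field


@dataclass
class HolderRecord:
    """Lease manager side: who holds leases on one resource."""
    object_id: str
    holder_vfn_ids: set = field(default_factory=set)
    usage_type: str = "read-only"   # or "write"


@dataclass
class HeldLease:
    """Lease client side: one held lease plus its modification log."""
    lease_id: str
    modification_log: list = field(default_factory=list)

    def record_write(self, offset: int, data: bytes) -> None:
        # Assumed log entry format: byte offset and the bytes written.
        self.modification_log.append((offset, data))

    def drain(self) -> list:
        """Hand over pending modifications for propagation and clear the log."""
        pending, self.modification_log = self.modification_log, []
        return pending
```

Draining the log in one step keeps an update from being propagated twice if propagation is triggered again before new writes arrive.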

[0327] A lease is typically granted by lease manager 44 in response to a first resource operation request made by a VFN receiver to a VFN transmitter. For example, during the first read or validation of a resource by the VFN receiver, or when the VFN receiver sends its first modification made to a resource, lease client 38 of the VFN receiver requests a lease from the lease manager of the VFN transmitter. If the lease manager approves the lease request, the lease manager returns a lease and, if the lease request was piggybacked on another operation request, the VFN transmitter returns an operation status responding to the other operation request. A lease manager can deny a lease request, by not returning a lease or returning a zero-length lease, in which case VFN receiver operations must be performed directly on the resource held by the VFN transmitter. To reduce message traffic, whenever possible, consistency messages and requests for operation are piggybacked on data requests.

[0328] FIG. 8 is a flow chart that schematically illustrates a method for requesting a read operation, in accordance with a preferred embodiment of the present invention. This method is used when client 28 requests from a VFN receiver a read operation on a resource registered with the VFN system and held by remote file server 25, and the VFN receiver does not already hold a read lease for the resource. After the request has been intercepted by the VFN receiver of the local VFN gateway GW1, as described above with reference to FIG. 6, the VFN receiver's lease client 38 requests a read lease from the lease manager 44 of the VFN transmitter that is the resource owner, at a read lease request step 120. The lease manager checks whether any other lease clients hold valid write leases for the resource, at a write lease check step 122. In such a case, the lease manager denies the read lease request, at a lease denial step 128. Access to the requested resource is still provided to the client, at a validated access step 130, in the manner described above with reference to steps 102 through 118 of FIG. 6. However, each client access to the resource requires validation of the resource with the original version of the resource held by origin file server 25. Upon each subsequent read request, the method is repeated beginning with step 120. After the interfering write lease has terminated, a read lease can be granted as described in the next paragraph.

[0329] If no other lease clients hold valid write leases, the lease manager grants the requested read lease, at a lease grant step 124. In this case, all read operations are performed locally at the VFN receiver, at a local access step 126. Validation of the resource with the original of the resource held by the origin file server 25 is not required.

[0330] It should be noted that a read request is denied when a write lease is held by another lease client, but not when another read lease is held by another lease client. Therefore, multiple VFN receivers (and multiple clients for each VFN receiver) can read a resource simultaneously. Each lease client renews the lease, using steps 120 through 126, as long as its client 28 is active.
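The read path of FIG. 8 may be sketched, under simplifying assumptions, as follows. The class LeaseManager and the function read_access_mode are hypothetical stand-ins for lease manager 44 and the VFN receiver; lease timeouts and renewal are omitted, and holders are tracked as plain sets of VFN ids.

```python
from collections import defaultdict


class LeaseManager:
    """Hypothetical stand-in for lease manager 44 (read path only)."""

    def __init__(self):
        self.read_holders = defaultdict(set)    # object id -> VFN ids
        self.write_holders = defaultdict(set)   # object id -> VFN ids

    def request_read_lease(self, object_id, vfn_id):
        # Step 122: does any *other* lease client hold a write lease?
        if self.write_holders[object_id] - {vfn_id}:
            return False                         # step 128: lease denial
        self.read_holders[object_id].add(vfn_id)
        return True                              # step 124: lease grant


def read_access_mode(manager, object_id, vfn_id):
    """Return how the VFN receiver would serve the read."""
    if manager.request_read_lease(object_id, vfn_id):
        return "local"       # step 126: served from the receiver
    return "validated"       # step 130: each access validated at origin
```

Because only outstanding write leases block a grant, any number of receivers can hold read leases on the same resource at once.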

[0331] The granted read lease remains valid until the earliest of: (i) the occurrence of its pre-set timeout in the absence of a renewal request, (ii) the voluntary revocation of the lease by the lease client because it is no longer needed, or (iii) the revocation of the lease by the lease manager, such as when another lease client requests a write lease for the resource, as described below.

[0332] FIG. 9 is a flow chart that schematically illustrates a method for requesting a write operation, in accordance with a preferred embodiment of the present invention. This method is used when a client 28 requests from a VFN receiver a write or read-write operation on a resource registered with the VFN system and held by a remote file server 25, and the VFN receiver does not already hold a write lease for the resource. After the request has been intercepted by the VFN receiver of the local VFN gateway GW1, as described above with reference to FIG. 6, the VFN receiver's lease client 38 requests a write lease from lease manager 44 of the VFN transmitter that is the resource owner, at a write lease request step 132. The lease manager checks whether any other lease clients hold valid read leases for the resource, at a read lease check step 134. In such a case, the lease manager revokes all of the other outstanding read leases for the resource, either asynchronously or synchronously, at a revoke other read leases step 142.

[0333] In any case, the lease manager next checks whether any other lease clients hold valid write leases for the resource, at a write lease outstanding check step 136. If so, the lease manager revokes all outstanding read and write leases for the resource, at a revoke all leases step 144, and forces the lease clients in VFN receivers holding any revoked write leases to flush updates to the peer VFN transmitters. The lease manager next checks the frequency of read and write activity of previous read and write lease holders, at a check activity level step 145. If the activity level was low, which may indicate that a lease was held but not needed, the lease manager proceeds to a read lease check step 137, described below. On the other hand, if the previous lease holders were active, the lease manager denies the write lease request, at lease denial step 146. Access to the requested resource is still provided to the client. However, each client access to the resource requires validation of the resource with the original of the resource held by the origin file server 25, and all writing must be performed by write-through to the original resource held by the origin file server 25, at a write-through step 148. Upon each subsequent write request, the method is repeated beginning with step 132. After the interfering write lease has terminated, a write lease can be granted.

[0334] On the other hand, if no write leases are outstanding for the resource or outstanding read and write leases were inactive, as determined at step 145, and if the lease manager is revoking read leases synchronously, the lease manager checks whether any read leases were revoked at step 142, at read lease check step 137. If so, the lease manager waits until the earlier of (i) the acknowledgement by lease clients of any read lease revocations issued at step 142 or (ii) expiration of the read leases for which revocations were issued at step 142, at acknowledgement/expiration wait step 138. If, on the other hand, the lease manager is revoking leases asynchronously, the lease manager skips step 137. In either case, the lease manager then grants the write lease (or grants the lease immediately, if no read leases were revoked), at a lease grant step 139. The VFN transmitter commits the requested modifications (which it received from client 28 when client 28 requested the write lease) to the resource. As described above with reference to step 128 of FIG. 8, further read leases are not granted while the write is in progress. Preferably, short write leases are granted so as to allow the granting of read leases as soon as possible thereafter. If the lease manager detects that the reads are no longer active, it may grant longer write leases.
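The decision flow of FIG. 9 may be summarized by the following sketch, which returns the outcome together with the sequence of actions taken. The function decide_write_lease, its boolean inputs, and the action labels are hypothetical; the step numbers in the comments refer to the steps named in the text.

```python
def decide_write_lease(other_readers: bool,
                       other_writers: bool,
                       holders_were_active: bool,
                       synchronous: bool):
    """Return (granted, actions) for one write lease request (step 132)."""
    actions = []
    if other_readers:                                   # step 134
        actions.append("revoke_read_leases")            # step 142
    if other_writers:                                   # step 136
        actions.append("revoke_all_leases_and_flush")   # step 144
        if holders_were_active:                         # step 145
            actions.append("deny")                      # step 146
            actions.append("write_through")             # step 148
            return False, actions
    # Step 137 applies only to synchronous revocation, and only if
    # read leases were actually revoked at step 142.
    if synchronous and other_readers:
        actions.append("wait_for_ack_or_expiry")        # step 138
    actions.append("grant")                             # step 139
    return True, actions
```

For example, a request that finds only other readers and uses synchronous revocation revokes the read leases, waits for acknowledgement or expiry, and is then granted; with asynchronous revocation the wait is skipped.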

[0335] After receipt of the write lease, all read operations by client 28 are performed locally at the VFN receiver, as described above. All write operations can be performed using a write-back cache scheme, as described below, at a write-back caching step 140. When modifying the resource, the VFN transmitter increments the version number of the resource, which is used for synchronization and integration of changes from disconnected VFN gateways.

[0336] The granted write lease remains valid until the earliest of: (i) the occurrence of its pre-set timeout in the absence of a renewal request, (ii) the voluntary revocation of the lease by the lease client because it is no longer needed, or (iii) the revocation of the lease by the lease manager, which occurs when another lease client requests a write lease. Additionally, if another lease client requests a read lease for the resource, the write lease holder is given the option to downgrade its write lease to a read-only lease. If the write lease holder exercises this option, generally because the holder is no longer actively updating the resource, the read lease is granted. Otherwise, the read lease request is denied, at step 128, as described above.

[0337] The leasing approach described above ensures single copy semantics, whereby every read operation sees the effect of all previous write operations, and read and write requests cannot execute concurrently. When revoking a lease because a resource has been modified, the VFN transmitter optionally includes hints (for example, ranges in a file that have been modified) in order to improve update propagation to VFN receivers that held leases on the previous version of the resource.

[0338] After a read lease has been granted, it can be upgraded to a write lease upon a request by the lease client holding it. Similarly, a write lease can be downgraded to a read lease after the VFN receiver has flushed resource modifications to the VFN transmitter whose lease manager granted the lease.

[0339] A lease is allowed to expire silently at the end of its specified duration if its associated resource is no longer needed by the VFN receiver whose lease client holds the lease (for example, if a file has been closed by its client 28). If the VFN receiver needs continued access to the resource to proceed with an operation, the lease on the resource may be extended by the lease manager pursuant to a request by the VFN receiver's lease client. Such extension requests are preferably piggybacked on other data sent by the VFN transmitter and/or with requests for invalidation of leases no longer needed. A lease can also optionally be extended independently by its granting lease manager, typically by piggybacking the renewal on other messages if the lease is about to expire. The automatic expiration of leases removes any associated state at both the lease manager and lease client, without requiring the use of any WAN bandwidth. This bandwidth conservation is particularly advantageous when widely cached resources are modified.

[0340] In a preferred embodiment of the present invention, the lease manager grants the lease client a dual lease, which combines a short lease on the file set containing the resource (a “set lease”) and a longer lease on the individual resource (an “object lease”). A file set is a logical grouping of related resources, typically a whole share, such as an NFS mount point or a CIFS network share, or a directory. Different file sets can also be configured by a VFN administrator based on criteria such as spatial or temporal locality of resources. The use of a set lease reduces the bandwidth and processor costs of renewing leases by amortizing the cost of renewal over multiple related resources, and also may provide faster failure recovery. These savings generally more than compensate for the relatively frequent renewals necessitated. The combination of a set lease and an object lease typically provides the fault tolerance and consistency of short leases with the low overhead and performance benefits of long leases. The VFN receiver provides access to its cached resources to clients 28 so long as both the object and set leases held by the VFN receiver's lease client are valid.
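The dual-lease rule, under which cached resources are served only while both the set lease and the object lease are valid, may be sketched as follows. The class DualLease and its fields are hypothetical, and expiry times are assumed to be absolute times in seconds.

```python
from dataclasses import dataclass, field


@dataclass
class DualLease:
    """One short set lease shared by many longer object leases."""
    set_expiry: float                                    # short set lease
    object_expiry: dict = field(default_factory=dict)    # object id -> expiry

    def renew_set(self, now: float, duration: float) -> None:
        # A single renewal message refreshes every object in the file set.
        self.set_expiry = now + duration

    def can_serve(self, object_id: str, now: float) -> bool:
        expiry = self.object_expiry.get(object_id)
        if expiry is None:
            return False          # no object lease held for this resource
        return now < expiry and now < self.set_expiry
```

One call to renew_set restores access to every resource whose object lease is still running, which is the amortization the text describes.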

[0341] In another preferred embodiment of the present invention, the default behavior of the VFN system is customized to improve file sharing in several common application classes. For example, for a large class of applications, such as applications that require resource-sharing and process-synchronization over a network, tight file content synchronization is less important than maintaining file system structure synchronization. Typically, these applications create files to serve as semaphores or locks in order to achieve atomicity during critical operations. For this class of applications, the VFN may be configured to handle file creation and deletion in write-through mode, thereby allowing global application synchronization across VFN gateways.

[0342] A second common application class creates temporary files (often multiple large files) in shared directories that should not be available, or even visible, to a remote site. The VFN system preferably allows the specification of file types that should remain local to each VFN gateway and exempt from the consistency protocol.

[0343] Preferably, a VFN administrator can configure the VFN system to prevent granting of write leases for certain resources during specified time periods. For example, write leases may be prevented every day at a certain time when backup and file system updates are scheduled. Directives can also be issued that mandate write-through for certain resources. Update-delete conflicts that arise are preferably resolved as they would be on the origin file server.

[0344] Because the VFN system is distributed over multiple remote sites, it should be designed to gracefully handle conditions such as network failures or intentional bandwidth limitations. Thus, for example, the timeout periods of leases in the VFN system ensure that a VFN transmitter can continue to commit changes to resources despite an occasional connection or VFN receiver failure. In the event of such a failure, the VFN transmitter, in order to commit changes, does not need to wait indefinitely for the VFN receiver's lease client to acknowledge the VFN transmitter's lease manager's lease revocation, but rather only for the lease to expire. Lease client 38 also participates in failure recovery by renewing leases it held prior to the failure or disconnect.

[0345] Disconnected VFN receivers can continue optimistically serving resources to their local clients. However, because such disconnected resource access cannot provide hard consistency guarantees, the VFN system may restrict such access to read-only. (This may be accomplished by having the lease client issue dummy local read-only leases.) Read-only access is provided for cached and unauthorized HTTP resources. Alternatively or additionally, during disconnected operation, when a user requests a file that is marked as requiring strong consistency, a file-not-found exception is returned to the user.

[0346] Further alternatively, during disconnects, local clients may optimistically continue making changes locally. These changes must later be reintegrated with the origin resource held by file server 25. Upon reintegration, lease clients reconnect to lease managers and request new read leases. Lease clients also attempt to reestablish write leases previously held. Lease managers may renew a previously held write lease if the original write lease was for the same version of the resource currently on the origin file server 25. If these write leases are still available, modifications made since the last write update are sent to the VFN transmitter. If these write leases are not available, most changes can be applied automatically and only write-write conflicts must be handled with manual intervention (although write-write conflicts are generally very infrequent). In either case, while in disconnected mode, each VFN gateway provides a consistent view of the set of its own locally cached files. When communication is reestablished after a disconnection period, VFN receivers preferably attempt to reestablish the validity of all cached replicas of resources (possibly using a single per-volume check).

[0347] In order to enable lease manager 44 to revoke leases held by lease client 38, the VFN receiver preferably is able to accept connections from the VFN transmitter, in addition to its usual function of establishing such connections. If security considerations prohibit such connections (since firewalls are often configured not to accept remote HTTP and FTP connections), the VFN transmitter and VFN receiver can emulate bi-directional communication over unidirectional transport, as described below in the section regarding the adaptation layer, and thereby maintain HTTP and firewall friendliness. Alternatively, if bi-directional communication is not possible, revocation is initiated by the lease client holding the leases, by periodically polling the state of leases for a selected list of resources, termed the working set, which consists of frequently accessed resources. In this implementation, access to resources that are not in the working set requires validation and write-through.

[0348] Reference is now made to FIG. 10, which is a block diagram that schematically illustrates the deployment of a VFN file agent 90, in accordance with a preferred embodiment of the present invention. Preferably, a non-VFN local native client 92 can use and modify resources held by file server 25 concurrently with VFN system access to the same resources. To handle this possibility, the VFN system uses VFN file agent 90 to maintain consistency between VFN transmitter 52 and file server 25. The VFN file agent functions as a watchdog that notifies lease manager 44 of VFN transmitter 52 in local VFN gateway 22 when changes to resources registered with the VFN transmitter have been made directly by local native client 92.

[0349] Alternatively, the VFN transmitter may periodically poll the origin file server to ensure file consistency. When such local-client file server writes are detected, the VFN transmitter's lease manager revokes all leases for the modified resource. If any modifications have been made to the same resources by a holder of a write lease, these modifications are merged or discarded, based on the preconfigured policies set by management subsystem 33. To enable merging, modification records may be time-stamped, in which case the VFN system uses the copy with the latest modification time-stamp, and preferably logs a warning that the conflict has occurred. Alternatively, the system may be configured to always prefer the copy held by file server 25.
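The two conflict policies described above, latest time-stamp wins or the file server copy always wins, may be sketched as follows. The function resolve_conflict, the representation of a copy as a (timestamp, data) pair, and the logger name are hypothetical; resolving a tie in favor of the file server copy is an assumption.

```python
import logging

log = logging.getLogger("vfn.sketch")   # hypothetical logger name


def resolve_conflict(server_copy, lease_copy, prefer_file_server=False):
    """Pick one of two conflicting copies.

    Each copy is assumed to be a (modification_timestamp, data) pair.
    """
    if prefer_file_server:
        return server_copy
    # max() returns the first argument on a tie, so the file server copy
    # wins when both time-stamps are equal (an assumed tie-break).
    winner = max(server_copy, lease_copy, key=lambda copy: copy[0])
    log.warning("conflict detected; keeping copy stamped %s", winner[0])
    return winner
```

The warning is emitted only on the time-stamp path, matching the text's preference for logging when a conflict is resolved by recency.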

[0350] Alternatively or additionally, a CIFS client in a VFN transmitter may open files in shared mode on the local file server while a remote VFN receiver is writing a file locally. When the file is opened by the VFN transmitter, and the CIFS client is granted a CIFS opportunistic lock (op-lock) from the origin server, the VFN transmitter preferably uses the op-lock as a guarantee of exclusivity (read-write caching or read-caching only). This approach allows more efficient synchronization between the VFN transmitter and the origin server. When using op-locks, in order to preserve strict coherency, all CIFS directory operations are performed directly on the origin file server, because CIFS op-locks lock only files and not directories.

[0351] Preferably, a VFN administrator can configure the polling rate of VFN transmitter 52 to increase or decrease the consistency level, resulting in a higher or lower load on file server 25. Consistency between VFN transmitter 52 and file server 25 is preferably configured to be lower than consistency between VFN transmitters and VFN receivers, to avoid incurring a prohibitive overhead and load on the VFN transmitter or origin file server. Optionally, if the file server's local clients require stronger consistency, these local clients can access the most current replica through the local VFN gateway (loop-back access).

[0352] In a preferred embodiment of the present invention, the VFN system adaptively optimizes the duration of leases by operation type. This optimization involves a trade-off between increasing WAN communication efficiency (by using longer leases) and reducing VFN transmitter server state (by using shorter leases). Shorter write leases also potentially provide stronger consistency. Preferably, the duration of a lease is set to the longest time possible that is not likely to require revocation. For this purpose, the VFN transmitter varies the lease period based on the type of resource in order to match file usage scenarios. For example, “read-only” resources can have relatively longer lease periods than writeable resources.

[0353] The VFN system preferably employs different consistency levels as appropriate for each resource type. For example, the VFN system typically provides strong consistency for resources held by file servers and weak consistency for resources held by Web servers. For resources held by Web servers, the VFN system preferably uses standard HTTP cache behavior. Preferably, the default cache policy for FTP servers provides relaxed consistency guarantees, similar to those for HTTP, because FTP itself does not make consistency guarantees. In order to apply the appropriate level of consistency, the VFN system keeps track of the type of server from which each resource originated, as described above. These general rules may be varied by directives issued by the VFN administrator, so as to provide stronger or weaker consistency for specific resources or types of resources, as described above.

[0354] The VFN system's use of leases provides several benefits. Strong consistency guarantees can be provided even when there are multiple concurrent readers and writers, because a VFN transmitter must notify VFN receivers holding valid leases of any pending changes to resources. Leases improve system performance because most operations can be completed by the VFN receiver locally. Write-write and read-write conflicts between users of the same VFN gateway are resolved locally. Additionally, because leases are typed by their operation, they minimize false client invalidations for read sharing, which sometimes occur in distributed file systems that use leases or callbacks that are not typed.

[0355] Concurrency Control

[0356] VFN gateways 22 preferably provide full native network file system functionality to clients 28, including support for external application-generated lock requests. The support of leases for consistency and support of locks for concurrency in the VFN system are essentially unrelated functions, although there are certain similarities of implementation. (Locks can be viewed as a special type of lease.) Consistency is an internal VFN system function, while locks are supported to provide a service to external user applications. Preferably, file locking is supported for multiple operating systems, including support for the UNIX NLM (Network Lock Manager, the NFS network locking manager), and the Win32 API access modes and sharing modes for files in Windows.

[0357] File locking is used by processes to synchronize access to shared data. File systems typically provide whole file or byte-range locking of two types: mandatory and advisory (also called discretionary). Mandatory locking is enforced by the file system. It prevents all processes, except those of the lock holder, from accessing the locked file. Advisory locking prevents others from locking a file (or a range within the file), but does not prevent others from accessing the file. It can be effective between cooperative processes only.

[0358] The VFN system preferably supports both mandatory locking, as is used in CIFS, and advisory locking, as is used in NFS. Both mechanisms are used to support lock requests from user applications. Most preferably, byte-range locking is supported, as well, for both CIFS and NLM. Optionally, the VFN system supports interoperating CIFS and NLM file locking and sharing operations (at VFN transmitters and/or VFN receivers). When such support is provided, operations contending for the same resource must adhere to the stricter locking paradigm, i.e., mandatory locking, while maintaining the correct operation of other clients.

[0359] FIG. 11 is a block diagram that schematically illustrates details of VFN system 20 that relate to lock management, in accordance with a preferred embodiment of the present invention. VFN transmitter 52 comprises at least one lock client 150, and VFN receiver 48 comprises a lock server 154. (These elements of VFN gateway 22 were omitted from FIG. 3 for the sake of simplicity.) The lock client and lock server communicate with one another over WAN 29 and together facilitate the issuance and management of locks. Alternatively, lock client 150 and lock server 154 can be implemented as part of transmitter application layer 42 and receiver application layer 40, respectively, rather than as separate components of VFN transmitter 52 and VFN receiver 48. Preferably, VFN transmitter 52 comprises a separate instance of lock client 150 for each file server 25 to which it is connected, or, optionally, for each mount point on each file server.

[0360] Locks in the VFN system preferably have the following data structure:

[0361] Lock={object id, client id, grant time, duration, epoch}

[0362] wherein object id represents the identity of the resource to which the lock applies, using the internal resource identification numbers of the VFN system. For lock clients, client id denotes the peer lock server from which the lock request was received. For lock servers, client id denotes the process on the client 28 that requested the lock. Grant time and duration are used for automatic lock expiration, as described below. Epoch is an identification of a specific application instance (comprising, for example, one or more of the following parameters: machine id, process-id, process creation time, or a random value). Epochs are used to facilitate coordination of shared state in a distributed application. They are used to determine if the shared state was created by the instance with which an application is currently communicating (for example, in the case of a reconnect) or a previous instance (for example, in the case of a restart).
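The lock record and the epoch comparison may be illustrated by the following sketch. The class Lock, the choice of a tuple for the epoch, and the helper created_by_current_instance are assumptions; the text lists possible epoch components but does not fix a representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lock:
    object_id: int      # internal VFN resource identification number
    client_id: str      # peer lock server, or requesting client process
    grant_time: float   # seconds
    duration: float     # seconds, used for automatic expiration
    epoch: tuple        # e.g. (machine id, process id, creation time)

    def expired(self, now: float) -> bool:
        return now >= self.grant_time + self.duration


def created_by_current_instance(lock: Lock, peer_epoch: tuple) -> bool:
    """True on a reconnect to the same instance; False after a restart."""
    return lock.epoch == peer_epoch
```

Including the process creation time in the epoch distinguishes a restarted process that happens to reuse the same machine id and process id.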

[0363] Lock server 154 accepts lock and unlock requests from clients 28. Upon receiving a request, the lock server preferably performs certain management functions, such as issuing any denials based on locally-available information and/or caching and combining requests for short periods in order to enhance system performance. If the request is not denied, the lock server then passes the request to the lock client that resides in the VFN transmitter that owns the resource. Upon receiving a response from this lock client, the lock server forwards the response to its client 28. Lock server 154 preferably shares data with the servers in interception layer 54 (FIG. 3), such as with NFS server 56, to ensure that locking is supported on a per gateway basis. Preferably, lock server 154 supports NLM Version 3 in order to support NFS Version 2 user requests, and NLM Version 4 in order to support NFS Version 3 user requests.

[0364] Lock client 150 accepts lock and unlock requests from lock server 154, preferably through a CGI interface. The lock client checks whether the requests conflict with any other remote locks that the lock client has issued. If so, the lock client preferably resolves the conflict by using arbitration logic. If not, the lock client executes the requests on file server 25, which in turn executes the request on its origin copy of the resource, using the file server's native locking support (that is, outside the VFN system). Execution on the origin file server is necessary in order to provide end-to-end coordination of locks. The lock client waits until it receives a response from file server 25, and passes this response to the lock server. This synchronous operation of the lock client and server with the file server ensures correct arbitration of lock requests between multiple VFN receivers and avoids possible deadlocks. The lock client preferably maintains tight control of all lock requests issued to file server 25 in order to avoid accidentally reissuing a request (for example, for a different client), which might result in the lock client locking itself out of access to a resource.

[0365] Preferably, lock client 150 tracks outstanding locks using the following data structure for each lock issued:

[0366] Map={lock id, lock}

[0367] Lock id is a unique identifier for each lock issued, and lock is the lock object, whose data structure is described above.
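The map of lock id to lock object, together with the conflict check performed by the lock client, can be sketched as follows. This is a minimal illustration only: the fields of `Lock` (path, byte range, owner) and the names `LockMap`, `add`, `remove`, and `conflicts` are assumptions made for the example, since the lock object itself is defined elsewhere in the specification.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Lock:
    # Hypothetical fields; the actual lock object is described elsewhere.
    path: str
    offset: int
    length: int
    owner: str  # identifier of the requesting VFN receiver


class LockMap:
    """Tracks outstanding locks as Map = {lock id, lock}."""

    def __init__(self):
        self._next_id = count(1)
        self._locks = {}

    def add(self, lock):
        """Record an issued lock and return its unique lock id."""
        lock_id = next(self._next_id)
        self._locks[lock_id] = lock
        return lock_id

    def remove(self, lock_id):
        """Forget a lock; returns the lock object, or None if unknown."""
        return self._locks.pop(lock_id, None)

    def conflicts(self, path, offset, length):
        """Return ids of outstanding locks overlapping the given byte range."""
        end = offset + length
        return [
            lock_id
            for lock_id, held in self._locks.items()
            if held.path == path
            and offset < held.offset + held.length
            and held.offset < end
        ]
```

A request whose range overlaps an entry returned by `conflicts` would be handed to the arbitration logic; otherwise it would be executed on the origin file server and then recorded with `add`.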

[0368] In order to maintain a lock on a file, operating systems generally require that the file handle for the file remain open. Therefore, in order to maintain locks on files held by origin file server 25, the VFN transmitter keeps locked files open on the file server. Preferably, in order to enable scaling of the VFN system to support the issuance of large numbers of simultaneous locks, the VFN transmitter supports the issuance of more locks than the number of simultaneous handles allowed by the operating system for one process. For example, the default maximum number of handles per process on UNIX is 1000, including all communication handles such as file handles, sockets, and pipes. Support of larger numbers of locks is preferably accomplished in the VFN system by spawning external slave processes only for the purpose of maintaining open handles. These external processes are supported by a protocol between the origin VFN transmitter and its subsidiary slave processes. Optionally, these slave processes may control lock agents to physically place and remove locks from repositories.

[0369] Locking in system 20 can typically use at-least-once semantics, because reissuing a held lock to the same client is generally not harmful. The exception to this generalization is when the network file system on server 25 uses reference-counting of locks, in which case a single response to each request is preferably ensured. When using at-least-once semantics, the protocol between the lock server and lock client typically does not need to ensure a reliable WAN connection because retransmissions are permitted.

[0370] Preferably, lock server 154 supports lock and unlock requests generated not only by clients 28, but also by the VFN receiver itself. This feature enables the VFN system to generate internal lock commands (i.e., not user application-generated) for enhancing consistency guarantees. For example, if a file is locked by the VFN system on the origin file server (even though the lock was not requested by the client accessing the file), the file cannot be modified without permission from the VFN transmitter. This approach generally provides better consistency, albeit at the cost of reduced concurrency, which is often an acceptable tradeoff. Additionally, the repository plug-in API preferably supports locking.

[0371] Preferably, the VFN system implements internal delays when executing unlock operations in order to increase efficiency and reduce load on the VFN transmitter and origin file server. End-user applications typically request repeated locks for a file or region of files. Preferably, when an application requests an unlock operation for a file or region, the VFN receiver locally marks the file or region as unlocked, but does not relay the unlock request to the VFN transmitter. This local unlock is preferably assigned a relatively short expiration (such as less than 10 seconds), after which the unlock request is sent to the VFN transmitter. During the period prior to expiration, if another local lock is requested, this lock operation is completed locally at the VFN receiver, without the involvement of the VFN transmitter. Additionally, if the VFN transmitter receives a lock request from a first VFN receiver for a file that the VFN transmitter believes is locked by a second VFN receiver, the VFN transmitter consults the second VFN receiver as to whether it is possible to unlock the resource. In such a case, the second VFN receiver will preferably release any delayed locks it is holding without active user locks, or will refuse the request if the lock owner is a “real user.” This method of lock delegation is effective in a typical case of repeated access or low contention (if the delay period is sufficiently long).
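The delayed-unlock behavior at the VFN receiver can be sketched as below. This is an illustration under stated assumptions, not the described implementation: the class name `DelayedUnlockTable`, its method names, the 5-second default delay, and the string results `"local"` and `"remote"` are all hypothetical. Time is passed in explicitly so that the logic is easy to follow and test.

```python
class DelayedUnlockTable:
    """Receiver-side bookkeeping for delayed unlock (hypothetical sketch)."""

    def __init__(self, delay_seconds=5.0):
        self.delay = delay_seconds
        self.held = set()   # resources locked on behalf of an active user
        self.pending = {}   # resource -> time at which the unlock must be sent

    def lock(self, resource, now):
        """Return "local" if satisfied at the receiver, else "remote"."""
        if resource in self.pending:
            # The unlock was never relayed, so the transmitter still holds
            # the lock for this receiver and it can be re-granted locally.
            del self.pending[resource]
            self.held.add(resource)
            return "local"
        self.held.add(resource)
        return "remote"  # caller must forward the request to the transmitter

    def unlock(self, resource, now):
        """Mark the resource unlocked locally and defer the real unlock."""
        self.held.discard(resource)
        self.pending[resource] = now + self.delay

    def expire(self, now):
        """Return resources whose deferred unlock must now be sent."""
        due = sorted(r for r, deadline in self.pending.items() if deadline <= now)
        for resource in due:
            del self.pending[resource]
        return due

    def revoke(self, resource):
        """Transmitter asks to release; refuse if a real user holds the lock."""
        if resource in self.held:
            return False
        self.pending.pop(resource, None)
        return True
```

In this sketch a lock request arriving after the delay has elapsed, but before `expire` has run, is still granted locally, which is safe because the transmitter has not yet been told of the unlock.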

[0372] If liveliness status is required in the origin file server, it can be piggybacked on the current VFN monitoring.

[0373] In the preferred embodiment shown in FIG. 11, VFN transmitter 52 and VFN receiver 48 each comprise a status monitor 158. Each status monitor 158 comprises a lock status monitor 152, which monitors the status of the VFN gateways in order to enable lock client 150 and lock server 154 to recover from reboots and system crashes. Alternatively, the functionality of lock status monitor 152 can be provided by other monitoring utilities in the VFN gateway, rather than by a separate component. Preferably, locks are released and not reestablished upon a crash. Alternatively, locks are reestablished, and the lock status monitors maintain consistent state to enable such reestablishment. For efficient recovery from crashes, each lock request is preferably assigned a unique identification number that is granted for a specified duration. Locks not renewed during their periods expire automatically, in a manner similar to the expiration of non-renewed consistency leases, as described above. The lock agent in the origin site must maintain a persistent list of files (or byte ranges) that are locked, to allow their release after a crash.

[0374] Preferably, status monitor 158 in VFN receiver 48 further comprises a network status monitor (NSM) 156, which provides crash-recovery services to clients 28 implementing NFS, pursuant to the standard NFS NSM protocol. Optionally, the standard NSM daemon (called statd) can be used as this component for VFN receivers residing on a UNIX server. Alternatively, NSM 156 can be implemented as part of the VFN receiver, rather than as a separate component. For protocols, such as CIFS, that drop shared state (open file handles, locks, etc.) upon disconnection, the VFN receiver preferably disconnects active clients when disconnected from the VFN transmitter or when the VFN transmitter has been restarted. The VFN receiver preferably detects such disconnection and restarts using its monitoring information and epoch, as described above.

[0375] Crawling and Archiving

[0376] In a preferred embodiment of the present invention, VFN transmitter 52 comprises a crawler component (not shown) that traverses local file systems, HTTP, and FTP directory trees in order to generate a list of available resources. This information is used, inter alia, for pre-positioning of resources, subject to appropriate directives and parameters, as described above. The VFN transmitter sends this list to its peer VFN receivers, which pre-position the resources as scheduled. Preferably, the crawler monitors changes in specified directories by periodically generating a current list of resources and their attributes, which may be used in the virtual directory, as described above.

[0377] Preferably, VFN transmitter 52 also comprises an archiver component. When the crawler encounters resources that are tagged with the archive parameter, as described above, the archiver packages all the tagged resources into a single archived and compressed file, such as a ZIP file. The VFN receiver downloads the compressed file during pre-positioning and extracts the resources.

[0378] The crawler and archiver may be implemented as services in a single servlet container, such as an Apache Tomcat servlet container. Alternatively, the crawler and/or archiver may be deployed as stand-alone components, rather than as components of the VFN transmitter.

[0379] Export and Import

[0380] In a preferred embodiment of the present invention, VFN system 20 supports the export of remote resources, via a VFN receiver, into non-VFN native file systems. User applications can directly access these exported resources via the appropriate native file system. Resources exported from a VFN receiver preferably maintain the same relative path that the resources have on the source VFN transmitter. The local native file system root path of the export is determined based on the local configuration of the VFN receiver. The Uniform Resource Identifier (URI) of the resource determines the relative path from the root, in a manner that is specified in applicable directives. File properties of exported files, such as size, modification time, and owner, are preferably identical to the properties of the source file.

[0381] Responsive to a synchronization parameter in an export directive and specific metadata regarding each resource, the VFN system preferably keeps these exported resources synchronized with their original copies. All VFN cache operations, including pre-positioning, updating, and invalidation, can be applied to exported resources. Because access to exported resources cannot be intercepted by the VFN receiver, the consistency and view of the exported resources may not always be accurate and/or complete. Typically, the VFN gateway does not enforce access rights for exported resources, although enforcement of such access rights is possible.

[0382] Export characteristics are preferably configured through the local configuration file of each VFN receiver. By default, resources brought into the VFN receiver's cache are typically not automatically exported, but automatic export to an external file server may be configured, for example, for backup. File and directory mode attributes for export are likewise configurable at the local VFN receiver. The mode attribute can be set to one of the following values:

[0383] no_duplicate: operations are carried out only on the cache of the VFN receiver.

[0384] duplicate_prefetch: when resources are pre-positioned they are also exported.

[0385] duplicate_all: any cache operation applied to a resource is also applied to the corresponding exported resource.

[0386] Preferably, the VFN system supports authenticated file export to FTP servers, as well as the import of resources held by local native file systems into the VFN system.

[0387] Fetching Queue

[0388] Each VFN receiver 48 preferably maintains a queue of requests for the fetching of remote resources. The queue is ordered by the priority of the requests. Preferably two or three priority levels are supported by adaptation layer 45. Priority is preferably in the following order:

[0389] current user application requests;

[0390] read-ahead requests;

[0391] requests scheduled by VFN administrator directive;

[0392] locally-generated automatic pre-positioning requests; and

[0393] automatically-triggered replication requests, which are replication requests initiated by the VFN system without intervention through a directive. These requests are preferably initiated based on internal heuristics and algorithms of the VFN system, such as resource popularity and change frequency.

[0394] Lower-priority requests are deferred unless there is excess bandwidth. When bandwidth is insufficient to simultaneously transfer all queued requests, lower-priority requests may be frozen (preferably at the TCP level) in order to reduce competition for bandwidth. After current-user requests are fetched, the VFN receiver preferably waits a certain amount of time prior to fetching any other requests. This delay often improves performance for the user, because user requests are frequently bursty and highly time-correlated. Preferably, application transport layer 46 provides self-regulation of queue length, including scheduling shortest tasks first and performing gate control (i.e., refusing new tasks under certain conditions).
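The ordering of the fetching queue can be illustrated with a small priority queue. The sketch assumes, for illustration only, that each of the five request categories listed above is its own priority level and that requests of equal priority are served in arrival order; the names `Priority` and `FetchQueue` are hypothetical.

```python
import heapq
from enum import IntEnum
from itertools import count


class Priority(IntEnum):
    # Lower value means higher priority, following the order listed above.
    USER_REQUEST = 0
    READ_AHEAD = 1
    ADMIN_SCHEDULED = 2
    AUTO_PREPOSITION = 3
    AUTO_REPLICATION = 4


class FetchQueue:
    """Queue of fetch requests ordered by priority, then by arrival."""

    def __init__(self):
        self._heap = []
        self._arrival = count()

    def push(self, priority, resource):
        # The arrival counter breaks ties so equal priorities stay FIFO.
        heapq.heappush(self._heap, (int(priority), next(self._arrival), resource))

    def pop(self):
        priority, _, resource = heapq.heappop(self._heap)
        return Priority(priority), resource

    def __len__(self):
        return len(self._heap)
```

Deferral of lower-priority requests, freezing at the TCP level, and the wait after current-user requests are not modeled here; they would be applied by whatever component drains the queue.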

[0395] Web Access to the VFN System

[0396] In a preferred embodiment of the present invention, VFN system 20 supports Web access to registered file system resources. A “home page” is provided at a VFN gateway, containing the root directories of all registered file servers. Users can use this home page to browse the remote file systems, without the need to define an HTTP proxy in their browsers. Additionally, the VFN system preferably includes a component that serves registered resources held by network file systems as HTTP content. HTTP clients without correct credentials are generally prevented from accessing files cached in the VFN receiver cache.

[0397] The VFN system preferably provides support for user client access to FTP resources. Such access is provided by translating the FTP resource into HTTP for use by the client, via a URL translation directive. Such FTP requests and responses are automatically gated and transformed by the VFN receiver. The FTP client can operate in either an active mode, in which it opens and listens to a data port, or in a passive mode, in which it becomes active only on demand. Preferably, the VFN receiver additionally supports the WebDAV protocol.

Adaptation Layer

[0398] Adaptation layer 45 (FIGS. 3 and 4) provides the VFN transmitter and receiver application layers with high-level services for bidirectional inter-VFN gateway communications over the WAN. As shown in FIG. 4, the adaptation layer of a VFN transmitter communicates with the adaptation layer of a VFN receiver of another VFN gateway.

[0399] If security considerations prohibit native bidirectional connections (since firewalls are often configured not to accept remote HTTP and FTP connections), the VFN transmitter and VFN receiver can emulate bidirectional communication over unidirectional transport, preferably using one of the following methods. The best choice of method depends on network and firewall configurations, with the first method preferable if it is supported.

[0400] The VFN transmitter uses HTTP/1.1 chunked responses and request pipelining over persistent connections after the establishment of the initial session-like communication. The VFN transmitter sends data as a chunk of some response, thereby emulating a non-ending response. When another request is received on the same connection, the response can be broken off and a new chunked response established for the new request. This approach allows the VFN transmitter to asynchronously send messages to the VFN receiver as soon as the messages are available. The VFN receiver does not need to know the length of the entire response (that is, the sum of the chunks), but only the length of each chunk as it is being sent.

[0401] The VFN receiver periodically polls the VFN transmitter by sending a “get-pending-messages” request. The VFN transmitter replies with queued messages. This approach is generally used with HTTP/1.0, which does not support chunked responses.

[0402] The chunked response approach generally provides better responsiveness and bandwidth utilization than the polling approach, because socket creation and destruction is eliminated from the path of each request, and additional TCP send/receive windows have a better chance of adapting to the network over the course of a prolonged connection.

[0403] The adaptation layer is implemented on top of application transport layer 46, which is described below, and implements features used in the VFN system to enhance WAN performance and utilization. Preferably four file system operations are optimized in adaptation layer 45: read, write, open, and close. Other common operations, such as directory-related operations, are preferably optimized in the VFN transmitter and receiver application layers, as described above. Alternatively, some or all of the services described in this section are implemented in application transport layer 46 and/or in VFN transmitter and receiver application layers 40 and 42.

[0404] Read

[0405] Adaptation layer 45 supports inter-VFN gateway data transfers requested by the transmitter and receiver application layers. In general, large resources are transferred from the gateway that is perceived to have the highest throughput among the gateways holding an up-to-date replica of the resource, as long as transfer from this gateway is permitted by the applicable administration directives. As mentioned above, transfers are preferably prioritized by the receiver application layer rather than by the adaptation layer.

[0406] Preferably, adaptation layer 45 uses an adaptive block size for transferring data over the WAN. The block size depends on the currently available bandwidth and latency of the link connecting the two VFN gateways that are communicating, and preferably is bounded by minimum and maximum size parameters. The block size is typically independent of the actual size of the resource being transferred.

[0407] Typically, when a resource is being transferred pursuant to a file system request processed by receiver application layer 40, the block size is larger than that which would be used in the original file system request. The original request was optimized for efficient use of the LAN, which has negligible latency and high bandwidth. Increasing the block size optimizes the request for efficient use of the WAN, which typically is characterized by substantial protocol latency and overhead. Block size is preferably set to the equivalent of at least a few seconds' data transfer, in order to allow TCP rate control sufficient time to converge. Despite this larger block size, redundant data is generally not transmitted over the WAN, since blocks are stored in the VFN receiver's cache for later use, as described above.

[0408] Preferably, the computation of the block size is performed using the following rule:

[0409] Block size equals RTD*REE, but not less than 4 kilobytes (as message overheads make lower values inefficient), and not more than a predetermined value such as 1 megabyte (otherwise caches may quickly overflow), where:

[0410] RTD equals the round-trip delay (in seconds) between the VFN receiver and VFN transmitter, and REE equals the end-to-end transfer rate (in bytes per second). RTD and REE are preferably dynamically calculated using measurements taken from past connections, to which exponential window averaging is applied. These parameters are available from standard TCP algorithms. Alternatively, RTD and REE may be configurable static parameters.

[0411] The calculated quantity RTD*REE represents the number of bytes that can be transmitted over an end-to-end connection in a single round-trip cycle. The function above bounds this quantity between a minimum of 4 kilobytes and a maximum of one megabyte, although larger or smaller limits may alternatively be used. An isolated, single user request cannot be served in less than RTD seconds, regardless of how small the requested resource is. The function balances two considerations. First, it is inefficient to transfer a very large block that will increase the client latency much above the RTD. Second, smaller blocks utilize the WAN connection inefficiently. The choice of a 4 kilobyte minimum block size reflects HTTP and VFN WAN protocol overheads, and the choice of a one-megabyte maximum block size reflects a reasonable maximum cache block size. Because the adaptation layer preferably uses parallel connections and connection pipelining, this block size is generally not an efficiency bottleneck, even in more loaded operations.
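The block-size rule, together with exponential averaging of the RTD and REE measurements, can be sketched as follows. The function and class names and the smoothing factor of 0.125 are assumptions chosen for illustration; the 4-kilobyte and 1-megabyte bounds are the example values given above.

```python
MIN_BLOCK = 4 * 1024       # 4 kilobytes
MAX_BLOCK = 1024 * 1024    # 1 megabyte


def block_size(rtd_seconds, ree_bytes_per_second,
               min_block=MIN_BLOCK, max_block=MAX_BLOCK):
    """Clamp RTD*REE (bytes per round trip) to [min_block, max_block]."""
    product = rtd_seconds * ree_bytes_per_second
    return int(min(max(product, min_block), max_block))


class ExponentialAverage:
    """Exponentially weighted average of a measured quantity."""

    def __init__(self, alpha=0.125):
        self.alpha = alpha   # weight given to the newest sample (assumed value)
        self.value = None

    def update(self, sample):
        if self.value is None:
            self.value = float(sample)
        else:
            self.value = (1 - self.alpha) * self.value + self.alpha * sample
        return self.value
```

For example, with a round-trip delay of 0.25 seconds and a transfer rate of 400,000 bytes per second, `block_size(0.25, 400_000)` yields 100,000 bytes, which lies between the two bounds and is used unchanged.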

[0412] Adaptation layer 45 preferably uses a heuristic for performing lazy read-ahead of files and file blocks in order to pre-position files and file blocks that are likely to be needed by a user application. (A client application often accesses only certain blocks of a large file. This block access is supported by the VFN system, both by the VFN receivers when serving resources, and during inter-VFN gateway communications.) Preferably, an algorithm analyzes real-time file usage patterns to detect sequential access patterns, which are common in many applications.

[0413] Preferably, adaptation layer 45 adapts its detection of sequential access patterns according to the file type of the resource. This adaptation is beneficial because some file types are characterized by a particular access pattern that differs from typical sequential access. Such files typically include a data structure that can be used for accessing data internal to the document. Examples of such data structures include the directory structure used in ZIP files (listing file contents and attributes), a document map in Adobe® Portable Document Format (PDF) files, and, for directory operations, Windows icons associated with an executable file for displaying the executable file in a listing. Adaptation layer 45 preferably tracks access to these files (either at the VFN receiver or VFN transmitter), collects access patterns, and utilizes the access patterns to perform more predictive pre-positioning. Preferably, fixed patterns in a file are detected. Alternatively or additionally, the adaptation layer (preferably in the VFN transmitter) comprises application-specific handlers that analyze and push read-ahead blocks. For example, ZIP directories and Windows icons may be referenced using an in-file offset listed in specific locations of the file.

[0414] When particular usage patterns are detected, the VFN receiver attempts to pre-position additional blocks of the same file before they are requested by the VFN receiver's client. Additionally, the read-ahead algorithm preferably exploits common access patterns in each network file system, such as access patterns resulting from a folder-browsing request. Resources are pre-positioned if their request is found to be highly correlated with recent requests for other resources. As noted above, the algorithm takes into account available bandwidth by assigning a low priority to read-ahead transfers, thus avoiding delays in transfer of data for on-demand requests. Preferably, the balance of a file is pre-positioned after a certain number of sequential reads of the file, typically five such reads. This threshold reflects the observation that after five sequential reads, the probability of full file sequential access is greater than 80%.
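The five-read threshold can be illustrated with a simple detector that counts consecutive reads in which each read begins where the previous one ended. This is only one plausible reading of "sequential reads"; the class name `SequentialReadDetector` and its interface are hypothetical.

```python
class SequentialReadDetector:
    """Signals read-ahead after a run of back-to-back sequential reads."""

    def __init__(self, threshold=5):
        self.threshold = threshold
        self._next_offset = {}   # path -> offset expected for the next read
        self._run = {}           # path -> length of current sequential run

    def on_read(self, path, offset, length):
        """Record a read; return True once, when the run reaches the threshold."""
        if self._next_offset.get(path) == offset:
            self._run[path] = self._run.get(path, 0) + 1
        else:
            self._run[path] = 1  # first read, or a seek broke the run
        self._next_offset[path] = offset + length
        return self._run[path] == self.threshold
```

When `on_read` returns True, the caller would enqueue a low-priority read-ahead request for the balance of the file.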

[0415] Additionally, the VFN receiver may attempt to pre-position files by detecting access patterns that span multiple files, such as application-related files. Such patterns are preferably detected using application- or application-class-specific algorithms. For example, a rule might be formulated pursuant to which when a file of a certain type is first read, all files with the same base-name in another related directory are pre-fetched. Alternatively or additionally, self-learning algorithms for detecting correlations may be used, as are known in the art.

[0416] Preferably, adaptation layer 45 uses compression for file transfer between the VFN transmitter and the VFN receiver. Most preferably, the VFN system is pre-configured with a default set of file types that are known to be compressible. Files of these types are automatically compressed if greater than a certain minimum size. Additionally, a VFN administrator can further configure the VFN system to compress files by certain other criteria, such as file type, size, or location. For example, the VFN system can be configured to compress all Microsoft Word files greater than 200 kilobytes. Preferably, the adaptation layer utilizes adaptive configuration to vary the parameters for applying compression based on current WAN performance and constraints. For example, compression may be applied more aggressively during business hours when WANs are generally more highly utilized. Preferably, zlib compression is used, although other compression tools can be used, as well.

[0417] To implement compression, the VFN receiver preferably indicates that compression should be attempted on a requested file by marking such a request in the VFN request header sent to the VFN transmitter. Upon such a compression request, the VFN transmitter compresses the file onto a temporary local copy and compares the size of the compressed file with the original file. For real-time transfer requests, the compressed version is used only if the overall response time is decreased, taking into consideration the decompression processing latency. Alternatively, the decision to return the compressed version is based on the compression percentage achieved (for example, at least 30%). Otherwise, the uncompressed version is returned. For pre-positioning transfers, compression is triggered if the compressed version is smaller than the uncompressed version. In all cases, the VFN transmitter marks whether the file is compressed in the transmitter's response header.
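The transmitter-side decision can be sketched with Python's standard `zlib` module. The sketch uses the percentage-based alternative for real-time requests (at least 30% reduction) rather than a response-time estimate, and works on in-memory bytes rather than a temporary file; the function name `prepare_response` and its parameters are hypothetical.

```python
import zlib


def prepare_response(data, compression_requested, realtime, min_saving=0.30):
    """Return (payload, is_compressed) for a requested file's contents.

    realtime=True  -> compress only if the size shrinks by at least min_saving.
    realtime=False -> pre-positioning: compress whenever the result is smaller.
    """
    if not compression_requested:
        return data, False
    compressed = zlib.compress(data)
    if realtime:
        use_compressed = len(compressed) <= len(data) * (1 - min_saving)
    else:
        use_compressed = len(compressed) < len(data)
    if use_compressed:
        return compressed, True
    return data, False
```

The returned flag corresponds to the marker placed in the transmitter's response header; a receiver seeing the flag set would call `zlib.decompress` on the payload.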

[0418] Adaptation layer 45 preferably breaks large files into blocks for transfer via parallel TCP connections, whereby multiple threads of adaptation layer 45 on the VFN receiver open sockets and fetch different parts of the file concurrently. Parallel connections typically significantly enhance effective throughput over a WAN link. The maximum number of concurrent TCP connections K is either pre-configured or adaptively set based on observed throughput gain. The pre-configured default for K is preferably 4, similar to a typical Web browser default. Alternatively, the adaptation layer of the VFN receiver attempts to increase the number of concurrent connections to the VFN transmitter until no more overall throughput gain is observed. If no overall bandwidth decrease is observed after the termination of a connection, K is decreased by 1. Typically, setting K too high increases latency without affecting total bandwidth. Additionally, K can be reduced by throttling, as described below.
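The adaptive setting of K can be sketched as a simple search that adds connections while each addition still yields a measurable throughput gain. The callback `measure_throughput`, the 5% gain threshold, and the upper limit of 16 are assumptions made for the example; how throughput is actually measured is not specified here.

```python
def tune_connections(measure_throughput, start=1, max_k=16, min_gain=0.05):
    """Increase K until one more connection no longer improves throughput.

    measure_throughput(k) is a caller-supplied (hypothetical) function that
    returns the overall throughput observed with k concurrent connections.
    """
    k = start
    best = measure_throughput(k)
    while k < max_k:
        candidate = measure_throughput(k + 1)
        if candidate < best * (1 + min_gain):
            break  # no worthwhile gain from an extra connection
        k, best = k + 1, candidate
    return k
```

The complementary rule, decreasing K by 1 when terminating a connection causes no drop in overall bandwidth, would be applied separately as connections close.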

[0419] Adaptation layer 45 preferably implements throttling to control the maximum bandwidth used by the VFN system over a WAN connection. Throttling is desirable so that VFN data does not cause network congestion that interferes with the throughput of non-VFN traffic. Throttling is particularly beneficial when there is asymmetry between the connection speeds of interacting VFN gateways.

[0420] The throttling mechanism is preferably based on the weekly configuration (per weekday per hour) of two bandwidth parameters: K (the maximum number of connections) and the total bandwidth consumed by the VFN. The total number of connections generally reflects the relative amount of bandwidth consumed by the VFN in relation to other TCP-based applications, because multiple TCP connections originating from the same site will generally distribute the bandwidth evenly in the absence of IP quality of service mechanisms. Therefore, a small value of K will throttle VFN system traffic during WAN peak traffic periods. Preferably, the VFN system additionally provides a configurable total bandwidth limit or socket limit, which bounds the total bandwidth consumed by the VFN system irrespective of other applications. Such limitations may be varied over different periods of the day or on a weekly basis. Optionally, only VFN receivers monitor and throttle their bandwidth use, while VFN transmitters, which are passive, do not regulate their response rates. Throttling preferably is used with queues in order to give preference to higher priority requests over lower priority requests.

[0421] Adaptation layer 45 preferably uses pipelining, whereby the adaptation layer at the VFN receiver issues multiple requests for blocks before waiting for responses on the socket. This mechanism generally reduces the overall response time of the VFN system. The adaptation layer retries failed transfers, and transfers only the remaining portion of a resource after a failed transfer.

[0422] Adaptation layer 45 preferably uses IP multicasting in order to more efficiently perform large-scale replication. Reliable multicasting mechanisms are used, preferably including forward error-correction techniques, as are known in the art, in order to save retransmission bandwidth and delays.

[0423] Adaptation layer 45 is preferably self-adapting to different situations in order to maximize efficiency. For example, when an up-to-date large file is available at more than one VFN transmitter, the VFN receiver preferably extends the methods of parallel transfer described above to address multiple sources. The VFN receiver attempts to transfer the file by concurrently transferring blocks of the file from all of the administratively-permitted VFN transmitters. Source priority is based on transfer-rate statistics, administrative directives, and source identity information recorded in the VFN metadata. Multi-source parallel transfer is often particularly useful when a WAN is characterized by links with asymmetric and/or heterogeneous rates. In such a case, faster links typically dominate the transfer.

[0424] The VFN receiver typically initiates a new block request each time a block transfer is completed, thereby utilizing the bandwidth available from the faster connections. When all blocks have been requested, but some blocks have yet to be received after a certain timeout period, these blocks are requested again over a higher-performance connection.

[0425] Adaptive routing algorithms are preferably used by adaptation layer 45 in order to provide faster file transfer. These algorithms determine which remote VFN transmitter is the best source of the resource to be transferred. Each VFN gateway maintains a ranking of its connection to all other VFN gateways based on continuous traffic measurements on each link. When transferring a small file, the destination VFN gateway requests the file from the highest-ranked VFN gateway that holds an up-to-date replica of the file. When transferring a large file, the destination VFN gateway transfers the file from a high-throughput source VFN gateway holding an up-to-date replica of the file, or, alternatively, from more than one source gateway using parallel transfer, as described above. For this purpose, the ranking of VFN gateways is preferably determined by checking replicated LAR information, as described above.

[0426] Adaptive routing can significantly accelerate file transfer, for example, when a destination VFN gateway has a high-speed connection to the WAN, and the requested file is available at several VFN gateways with low-speed connections to the WAN. File transfer can also be significantly accelerated when a file is transferred to a local VFN gateway from a remote site over a low-speed connection, and the local VFN gateway is connected to other VFN gateways over high-speed connections. In this case, if one of these other VFN gateways requests the file, the adaptive routing algorithm favors the local VFN gateway as the source of the file. For example, a small branch office in Haifa can request files that reside in the Santa Clara headquarters of an enterprise via a larger branch office of the enterprise in Tel Aviv. As a result, files are transferred over the slow transatlantic link only once, and can then be used by both branch sites. To implement schemes of this sort, VFN receivers are preferably able to accept and respond to HTTP requests from other VFN receivers, resulting in a chain of concatenated VFN receivers.

[0427] Adaptive routing can also be used to choose less expensive connections that are available on the WAN. Additionally, the adaptive routing algorithm can be used to increase VFN system availability and reliability in cases of temporary WAN disconnections or slowdowns.

[0428] Adaptive routing is preferably implemented using hierarchical caching and virtual directories. With hierarchical caching, VFN sites with higher long-distance bandwidth serve local sites (for example, a Tel Aviv site can serve a Haifa site from the Tel Aviv site's cached replicas). Virtual directories provide information regarding which resources and resource versions are currently available. For consistency, cached resources are used only if found to be version-consistent with the corresponding file metadata retrieved from the origin site.

[0429] Preferably, adaptation layer 45 applies delta compression forupdating files that have been previously pre-positioned or cached. Therequest for such a file includes a description of the current versionheld by the VFN receiver, including delta compression signatures, whichuse a cryptographic signature (preferably a collision-free one-way hashfunction) to convey information about the content of blocks currentlyheld by the VFN receiver. Based on this information, the adaptationlayer at the VFN transmitter transmits only the delta (missing orchanged parts) between the latest version of the requested file and theout-of-date version of the same file held by the VFN receiver. Theversions and delta information are preferably managed so that additionalfile versions are not required for delta compression. Delta compressionby adaptation layer 45 can also be used to efficiently handle insertionand deletions in mid-file, and can be optimized for multiple VFNgateways sharing the same resource.

[0430] Use of delta compression is often particularly advantageous for whole file transfer, such as during pre-positioning, and for read-ahead. Preferably, the VFN system is configured to delta compress only certain files, based on criteria such as type, size, or location. Additionally, other compression techniques, as described above, can be applied to the generated delta files. Delta transfer may also be used for on-demand transfers.

[0431] Preferably, delta compression is applied using file version correlation and/or using global compression. Compression based on file version correlation uses a delta compression algorithm, such as rsync (an open-source utility), to locate and reuse file chunks that are shared by different versions of a file for which a transfer has been requested. The VFN transmitter thus does not need to retransfer the data in any such reused blocks. Global compression extends the reuse concept to identify shared chunks among multiple files, ideally across the entire file system. Preferably, a utility such as LBFS (Low Bandwidth File System) is used to implement global compression. In either compression method, when a file needs to be transferred from one place to another, its chunk signatures are sent. In response, directions for creating the new version are received, such as whether to use a cached chunk or to transfer the data from the VFN transmitter. Both compression methods are known in the art, where they are typically used for offline, whole file transfers.
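The signature exchange described above may be illustrated by the following minimal Java sketch. It is an illustration under stated assumptions, not the implementation of the preferred embodiment: it uses fixed-size, aligned blocks and SHA-256 signatures, whereas rsync uses a rolling checksum and LBFS uses content-defined chunk boundaries, so this sketch does not detect blocks shifted by mid-file insertions. The class name BlockDelta, the Op type, and the block size are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: fixed-size aligned blocks, SHA-256 signatures.
final class BlockDelta {
    static final int BLOCK = 4; // tiny block size so the example is easy to trace

    // One instruction for rebuilding the new version: reuse a block the receiver
    // already holds (reuseIndex >= 0) or use literal bytes sent over the WAN.
    static final class Op {
        final int reuseIndex;  // index of a block in the receiver's old version, or -1
        final byte[] literal;  // literal data when reuseIndex == -1
        Op(int reuseIndex, byte[] literal) {
            this.reuseIndex = reuseIndex;
            this.literal = literal;
        }
    }

    static String sign(byte[] data, int off, int len) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            md.update(data, off, len);
            return Base64.getEncoder().encodeToString(md.digest());
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is mandatory on Java platforms
        }
    }

    // Receiver side: signatures of the blocks it currently holds.
    static List<String> signatures(byte[] oldVersion) {
        List<String> sigs = new ArrayList<>();
        for (int off = 0; off < oldVersion.length; off += BLOCK) {
            sigs.add(sign(oldVersion, off, Math.min(BLOCK, oldVersion.length - off)));
        }
        return sigs;
    }

    // Transmitter side: emit "reuse" for known blocks and literals for the rest.
    static List<Op> delta(List<String> receiverSigs, byte[] newVersion) {
        Map<String, Integer> known = new HashMap<>();
        for (int i = 0; i < receiverSigs.size(); i++) {
            known.putIfAbsent(receiverSigs.get(i), i);
        }
        List<Op> ops = new ArrayList<>();
        for (int off = 0; off < newVersion.length; off += BLOCK) {
            int len = Math.min(BLOCK, newVersion.length - off);
            Integer idx = known.get(sign(newVersion, off, len));
            ops.add(idx != null
                    ? new Op(idx, null)
                    : new Op(-1, Arrays.copyOfRange(newVersion, off, off + len)));
        }
        return ops;
    }

    // Receiver side: rebuild the new version from old blocks plus literals.
    static byte[] apply(byte[] oldVersion, List<Op> ops) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Op op : ops) {
            if (op.reuseIndex >= 0) {
                int off = op.reuseIndex * BLOCK;
                out.write(oldVersion, off, Math.min(BLOCK, oldVersion.length - off));
            } else {
                out.write(op.literal, 0, op.literal.length);
            }
        }
        return out.toByteArray();
    }
}
```

In this sketch, only the literal blocks would cross the WAN; a reused block costs one index rather than its full contents.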

[0432] Write

[0433] Adaptation layer 45 supports inter-VFN gateway write operations requested by clients 28. In a preferred embodiment of the present invention, the VFN system uses a write-back cache mechanism, whereby updated files are cached at the last writer's VFN receiver. The use of such a mechanism transforms an apparently synchronous operation into an asynchronous write operation at the adaptation layer. This approach significantly reduces the response time of VFN system 20 to user writes, while the write-back mechanism automatically creates multiple synchronized copies of resources.

[0434] To implement write-back caching, each VFN receiver maintains a log of changes made locally to the resource in question. Preferably, changes are synchronized with the peer VFN transmitter upon the occurrence of one or more of the following events, based on configuration settings:

[0435] at the time of lease renewal, as described above;

[0436] after a certain amount of time has passed from caching of the first write request. Preferably, the default maximum delay is 30 seconds, which is the same as the standard NFS client write buffer delay;

[0437] after a certain amount of time has passed since the most recent synchronization;

[0438] when the local VFN receiver buffer is exhausted;

[0439] when files are closed; and/or

[0440] when file sizes change.
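The synchronization triggers listed above may be combined into a single decision, as in the following minimal Java sketch. The class name WriteBackPolicy, its fields, and the way the events are passed in are hypothetical; only the 30-second default delay is taken from the description above.

```java
// Hypothetical policy object. Field names and thresholds are illustrative
// assumptions; only the 30-second default comes from the text above.
final class WriteBackPolicy {
    final long maxDelayMillis;      // maximum time since the first cached write
    final long maxSinceSyncMillis;  // maximum time since the most recent synchronization
    final long bufferCapacityBytes; // size of the local VFN receiver write buffer

    WriteBackPolicy(long maxDelayMillis, long maxSinceSyncMillis, long bufferCapacityBytes) {
        this.maxDelayMillis = maxDelayMillis;
        this.maxSinceSyncMillis = maxSinceSyncMillis;
        this.bufferCapacityBytes = bufferCapacityBytes;
    }

    /**
     * Decides whether logged changes should be sent to the peer VFN transmitter now.
     *
     * @param firstWriteAt time of the first unsynchronized write, or -1 if none is pending
     */
    boolean shouldSync(long now, long firstWriteAt, long lastSyncAt, long bufferedBytes,
                       boolean leaseRenewing, boolean fileClosed, boolean sizeChanged) {
        if (firstWriteAt < 0) {
            return false; // nothing to write back
        }
        if (leaseRenewing || fileClosed || sizeChanged) {
            return true;  // event-driven triggers
        }
        if (now - firstWriteAt >= maxDelayMillis) {
            return true;  // oldest cached write has waited long enough
        }
        if (now - lastSyncAt >= maxSinceSyncMillis) {
            return true;  // too long since the last synchronization
        }
        return bufferedBytes >= bufferCapacityBytes; // local buffer exhausted
    }
}
```

Because the triggers are selected by configuration, an actual implementation would presumably allow each test to be enabled or disabled individually; the sketch simply applies all of them.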

[0441] The optimal write cache size is typically calculated in a similar manner to read block size, as described above. Updates to file metadata are synchronously transferred to the source VFN transmitter, in order to provide other clients with up-to-date directory information.

[0442] Write-back caching generally improves performance by eliminating the overhead associated with write-through caching over a WAN, while simultaneously bounding the amount of time that can pass before changes are propagated to other VFN gateways. Optionally, a VFN receiver can delay and batch write-backs over multiple lease renewals, or until the receipt of a revocation from the lease manager of the peer VFN transmitter. Preferably, write-back is disabled (resulting in write-through) when there are multiple holders of write leases for a resource, as described above. Write-back may be disabled, for example, by setting a zero-duration timeout period on the write leases. Preferably, all operations that change directory structure or contents are performed in write-through mode.

[0443] Preferably, adaptation layer 45 utilizes compression, parallel connections, throttling, and routing for writing in substantially the same manner as for reading. When the consistency protocol permits the use of write-back, delta compression can be performed at the time the file is closed, as described above. Optionally, to implement delta compression on write-back, the adaptation layer on the VFN receiver sends its peer adaptation layer on the VFN transmitter instructions regarding how to create the new file version from the delta-compressed version.

[0444] Adaptation layer 45 is preferably pre-configured or configured by a VFN administrator not to copy temporary files to the origin file server 25 unnecessarily. Temporary files include files that are generated by an application for local backup and are removed when the application terminates.

[0445] Open/Close

[0446] The VFN system preferably enforces native file system access rights to files and directories transparently, including support of access control list (ACL) checking at the local VFN receiver. Such access rights are enforced both for on-demand resource access and for access to resources that have been pre-positioned or cached. This support is possible because the relevant file metadata has usually been pre-positioned or cached in the VFN receiver, as described above. Authorization is therefore checked locally at the VFN receiver. The VFN receiver preferably caches and negative-caches authorization results to enhance system performance.
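Caching of both positive and negative authorization results may be sketched as follows. This is a minimal illustration, assuming that an ACL check can be expressed as a predicate over a user name and a path; the class name AuthorizationCache and its methods are hypothetical, and expiry of cached entries is omitted.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiPredicate;

// Hypothetical cache of authorization results at a VFN receiver.
// Denials are remembered as well as grants (negative caching), so a
// repeated request from an unauthorized user avoids a second ACL evaluation.
final class AuthorizationCache {
    private final Map<String, Boolean> results = new HashMap<>();
    private final BiPredicate<String, String> aclCheck; // (user, path) -> allowed?

    AuthorizationCache(BiPredicate<String, String> aclCheck) {
        this.aclCheck = aclCheck;
    }

    synchronized boolean isAllowed(String user, String path) {
        String key = user + '\n' + path;
        Boolean cached = results.get(key);
        if (cached == null) {
            cached = aclCheck.test(user, path); // evaluate once, grant or denial
            results.put(key, cached);
        }
        return cached;
    }

    // To be called when cached metadata (and hence ACLs) may have changed.
    synchronized void invalidate() {
        results.clear();
    }
}
```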

[0447] The VFN receiver preferably supports share level security, allowing access to whole file trees when the share (or mount) is initially mapped. For non-native requests, the VFN system provides heuristics that permit a reasonable level of access without compromising security guarantees of the native file system security model. Requests to set access permissions are also supported.

[0448] Preferably, the VFN transmitter is configured to keep a resource on file server 25 open for a certain amount of time after the resource has been closed by client 28 of the VFN receiver. During this period, an open request from any of the clients of any of the peer VFN receivers of the VFN transmitter is handled locally by the VFN transmitter, without the need to interact with file server 25. This approach can improve VFN system performance when there are multiple open and close requests for the same resource.

Application Transport Layer

[0449] Application transport layer 46 is a framework for activating remote services used by the higher VFN application layers (adaptation layer 45 and VFN transmitter and receiver application layers 42 and 40). The application transport layer provides services that enable the different application layers to transfer data to and from one another.

[0450] Remote services are activated by bidirectionally transferring remote procedure call (RPC) messages between a client application transport layer (“RPC client”) on one VFN gateway and a server application transport layer (“RPC server”) on a second remote VFN gateway. Preferably, the application transport layer functions asymmetrically, whereby the RPC client sends RPC request messages to the RPC server, and the RPC server responds by sending RPC response messages to the RPC client. RPC request messages include the request and any necessary parameters, and RPC response messages include any necessary return values, such as a file. RPC requests, RPC responses, parameters, and return values are preferably Java objects, in order to support Java-based implementations of the higher application layers. Alternatively, the application transport layer functions symmetrically, whereby in addition to the RPC client issuing requests to the RPC server, the RPC server can issue requests to the RPC client. In such a symmetric implementation, the RPC server can connect to the RPC client at a later time in order to respond to an earlier request from the RPC client.

[0451] The application transport layer is preferably implemented in such a manner that the higher application layers are not aware of the details of the implementation, including the choice of network protocols. The application transport layer provides a simple API to its higher-level clients, which hides complexities, such as socket selection and resumption after disconnect. Preferably, the application transport layer provides communication-related properties to higher application layers, such as remoteIP and remoteWD. Higher application layers preferably are thus able to assign globally unique identifiers to their RPC requests. The application transport layer may use these identifiers to provide message correlation between RPC server replies and RPC client requests.

[0452] Preferably, the application transport layer supports reliable RPC between the RPC client and RPC server, whereby both sides must agree on the result of a method call, such as file locking. Each side is aware of which messages it has received and delivered to higher application layers. The application transport layer enables retransmission of timed-out requests and the recognition of such retransmissions by the recipient. Alternatively, retransmission may be implemented in a higher application layer, between application transport layer 46 and adaptation layer 45.

[0453] FIG. 12 is a block diagram that schematically illustrates details of application transport layer 46, in accordance with a preferred embodiment of the present invention. Application transport layer 46 comprises a server application transport layer 168 (“RPC server”) and a client application transport layer 170 (“RPC client”). Server application transport layer 168 comprises an RPC server control layer 160, which corresponds to an RPC client control layer 162 of client application transport layer 170. These RPC control layers provide services directly to adaptation layers 45 located at VFN gateways remote from one another.

[0454] Both the server and client application transport layers further comprise a data encapsulation layer 164 and a functional transport layer 166. The data encapsulation layer provides services for encoding and decoding data passed in RPC messages. Preferably the encapsulation is implemented using standard languages and protocols, such as XML and MIME.

[0455] Functional transport layer 166 handles WAN connectivity and the actual transfer of RPC messages between the client and server application transport layers. Preferably, functional transport layer 166 also implements security and privacy of data, as described below. For these purposes, the functional transport layer is most preferably implemented over HTTP, and in particular over HTTP 1.1. The use of HTTP 1.1 simplifies the deployment of the VFN system in enterprises that allow access to their sites only via HTTP and only through a single port. In addition, most HTTP proxies and firewalls support HTTP 1.1, and those that do not support HTTP 1.1 may support persistent connections and other features of HTTP 1.1.

[0456] The implementation of the functional transport layer and all higher layers, however, is preferably abstracted away from the specific HTTP functional transport protocol. For this reason, RPC message structure, serialization, encoding, registration, and dispatch are all decoupled from the functional transport layer. Thus, functional transport layer 166 can be implemented using other protocols, such as FTP or TCP (particularly when VPNs are used). If FTP is used, it is preferably configured to support authorization and credentials.

[0457] Application transport layer 46 preferably provides synchronous service to the protocol layers above it (although internally the RPC calls may be executed asynchronously to provide a more efficient and fair implementation). Higher layers may implement out-of-order mechanisms using submit/poll against the remote service handlers. Alternatively, other service patterns are supported, such as publish-subscribe, multicast delivery, or asynchronous notification, as are known in the art. In implementations that support asynchronous requests, the application transport layer notifies the higher-level application when a requested transfer is complete.

[0458] RPC client and RPC server are initialized as system services, which provide an RPC client context object and an RPC server context object, respectively, to the higher protocol layers. The RPC client and RPC server use similar RPC message structures, with differences as described below.

[0459] Because application transport layer 46 may provide the same service on several remote servers, and each RPC server may offer more than one service, an RPC request preferably identifies the remote RPC server to which it is addressed, the identity of the remote service it requires, and the identity of the method being called. Remote RPC servers are preferably identified using hostnames or logical names, in a manner similar to that of path or dot-notations used in URLs for HTTP. The identification of remote RPC servers may be included in the VFN system-wide configuration, or alternatively, a hard-coded default path+port may be used for each host name. Preferably, the Uniform Resource Name (URN) of an RPC server is not based on HTTP, in order to maintain abstraction away from HTTP. The RPC client and RPC server preferably use the same name for each service.

[0460] When logical names are used for RPC servers or services, the RPC framework of application transport layer 46 preferably provides a translation mechanism that uses configuration data to translate logical names into physical (hostname+path) server and service names. This translation capability provides a layer of abstraction which enables loosely coupled client and server parts. It also allows the VFN system to implement different services with the same logical name on different RPC clients.
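The translation of logical names into physical names may be pictured with the following minimal Java sketch. The class name LogicalNameResolver, the host names, and the format of the physical name are hypothetical; the fallback to a default port and path for unconfigured host names is one possible reading of the default path+port described above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical translation table from logical RPC server names to physical
// (hostname + port + path) names, populated from configuration data.
final class LogicalNameResolver {
    private final Map<String, String> configured = new HashMap<>();
    private final int defaultPort;
    private final String defaultPath;

    LogicalNameResolver(int defaultPort, String defaultPath) {
        this.defaultPort = defaultPort;
        this.defaultPath = defaultPath;
    }

    // Registers one configuration entry, e.g. "hq-files" -> "gw1.example.com:9000/rpc/files".
    void map(String logicalName, String physicalName) {
        configured.put(logicalName, physicalName);
    }

    // Configured logical names win; anything else is treated as a host name
    // and completed with the default port and path.
    String resolve(String name) {
        String physical = configured.get(name);
        return physical != null ? physical : name + ":" + defaultPort + defaultPath;
    }
}
```

Since each RPC client holds its own table in this sketch, the same logical name can resolve to different physical services on different clients.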

[0461] Application transport layer 46 preferably provides a generic mechanism for setting local and remote properties, in order to control the behavior of the application transport layer, including its sub-layers. Some of these properties are user-defined. The user-defined properties are assigned unique names and are preferably not passed as RPC request parameters or RPC response return values. Other properties are generic and are automatically created by RPC control layers 160 and 162, such as Client ID, Server ID, Local IP addresses, and Remote IP addresses.

[0462] Secure transfer over the Internet is also provided by application transport layer 46 when the VFN system is not operating over a secure VPN. Security is preferably provided by encrypting all data to be transferred with SSL and by using strong authentication. In this situation, a portion of VFN transmitter 52, including repository connector layer 50, resides inside the network firewall, in order to transfer resources into the VFN transmitter. Another portion of the VFN transmitter, including VFN HTTP server 78, resides in the Demilitarized Zone (DMZ) between the Internet and the network firewall, in order to communicate over the Internet. A similar arrangement applies to the VFN receiver.

[0463] Additional security may be provided by allowing HTTP access only from specified IP addresses, and/or adding special headers that identify VFN components, including a signature for privatization. Alternatively or additionally, certificates, such as client and/or SSL certificates, and/or credentials, such as HTTP basic or digest authentication, are used.

[0464] Encapsulation

[0465] Data encapsulation layer 164 provides services for encoding and decoding objects passed as RPC requests, RPC responses, parameters, and return values in RPC messages (referred to collectively herein as “RPC parameters”). As mentioned above, RPC parameters are preferably Java objects. Before a Java object can be sent to a remote application, it must be converted to an XML or binary representation. This conversion is commonly referred to as serialization, or “encoding.” The XML or binary representation is passed to the remote application, which converts it back to the original Java object. This conversion back is commonly referred to as deserialization, or “decoding.” RPC client 170 and RPC server 168 use serializers to perform encoding, and deserializers to perform decoding. Preferably, serializers and deserializers are Java objects that implement appropriate Java interfaces, as described below.

[0466] Each object class, or type, preferably has its own serializer and deserializer. Data encapsulation layer 164 provides several generic serializers and deserializers for common object types, such as String, Integer, Float, Boolean, and byte[]. These generic serializers and deserializers may be provided for both XML and binary encapsulation. Custom serializers and deserializers are preferably provided for each object type that a higher application layer may include as an RPC parameter. These custom serializers and deserializers are preferably registered in a registry (called RPCMappingRegistry). The data encapsulation layer and higher application layers use this registry to look up appropriate serializers and deserializers for non-generic object types. An RPC context registration service is used to register non-generic parameter types in this registry. Additionally, special serializers and deserializers are preferably provided to allow the passing of unknown object types.

[0467] A preferred Java interface of the RPCMappingRegistry is shown in Listing 1. One or more Java classes implementing this interface are used by applications to register and look up serializers and deserializers for both generic and non-generic object types.

Listing 1

[0468] public void mapXMLType(String elementType, Class javaType, XMLSerializer xs, XMLDeserializer xds);

[0469] public void mapBinaryType(String elementType, Class javaType, BinarySerializer bs, BinaryDeserializer bds);

[0470] public XMLSerializer querySerializer(Class javaType) throws IllegalArgumentException;

[0471] public XMLDeserializer queryDeserializer(String xmlType) throws IllegalArgumentException;

[0472] public String queryElementType(Class javaType) throws IllegalArgumentException;

[0473] public Class queryJavaType(String elementType) throws IllegalArgumentException;

[0474] A preferred Java interface of an XML serializer is shown in Listing 2. Serializers for encoding object parameters to XML implement this interface.

Listing 2

[0475] public void serialize(Class javaType, Object src, Writer output, RPCMappingRegistry rpcmr) throws IllegalArgumentException, IOException;

[0476] public int getLength(Class javaType, Object src, RPCMappingRegistry rpcmr) throws IllegalArgumentException, UnknownLengthException;

[0477] A preferred Java interface of an XML deserializer is shown in Listing 3. Deserializers for decoding XML-encoded parameters to Java objects implement this interface.

Listing 3

[0478] public Object deSerialize(String elementType, Node src,

[0479] RPCMappingRegistry rpcmr) throws IllegalArgumentException;

[0480] A preferred Java interface of a binary serializer is shown in Listing 4. Serializers for encoding object parameters to a sequence of bytes implement this interface.

Listing 4

[0481] public void serialize(Class javaType, Object src, OutputStream output) throws IllegalArgumentException, IOException;

[0482] public int getLength(Class javaType, Object src) throws IllegalArgumentException, UnknownLengthException;

[0483] A preferred Java interface of a binary deserializer is shown in Listing 5. Deserializers for decoding binary parameters to Java objects implement this interface.

Listing 5

[0484] public Object deSerialize(String elementType, InputStream input) throws IllegalArgumentException;
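By way of illustration, a generic String serializer and deserializer satisfying the binary interfaces of Listings 4 and 5 might look like the following minimal sketch. The interfaces are restated here (with Class<?> in place of the raw Class type) so that the example is self-contained. The wire format (a 4-byte length prefix followed by UTF-8 bytes), the class name StringBinaryCodec, the encode helper, and the body of UnknownLengthException are assumptions made for the example and are not specified by the description above.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Assumed shape of the exception named in Listing 4.
class UnknownLengthException extends Exception { }

// Restatement of Listing 4.
interface BinarySerializer {
    void serialize(Class<?> javaType, Object src, OutputStream output)
            throws IllegalArgumentException, IOException;
    int getLength(Class<?> javaType, Object src)
            throws IllegalArgumentException, UnknownLengthException;
}

// Restatement of Listing 5.
interface BinaryDeserializer {
    Object deSerialize(String elementType, InputStream input) throws IllegalArgumentException;
}

// Hypothetical codec: 4-byte big-endian length, then the UTF-8 bytes.
final class StringBinaryCodec implements BinarySerializer, BinaryDeserializer {
    @Override
    public void serialize(Class<?> javaType, Object src, OutputStream output) throws IOException {
        if (!(src instanceof String)) {
            throw new IllegalArgumentException("expected a String");
        }
        byte[] utf8 = ((String) src).getBytes(StandardCharsets.UTF_8);
        DataOutputStream out = new DataOutputStream(output);
        out.writeInt(utf8.length);
        out.write(utf8);
        out.flush();
    }

    @Override
    public int getLength(Class<?> javaType, Object src) {
        if (!(src instanceof String)) {
            throw new IllegalArgumentException("expected a String");
        }
        return 4 + ((String) src).getBytes(StandardCharsets.UTF_8).length;
    }

    @Override
    public Object deSerialize(String elementType, InputStream input) {
        try {
            DataInputStream in = new DataInputStream(input);
            int n = in.readInt();
            if (n < 0) {
                throw new IllegalArgumentException("negative length");
            }
            byte[] utf8 = new byte[n];
            in.readFully(utf8);
            return new String(utf8, StandardCharsets.UTF_8);
        } catch (IOException e) {
            // Listing 5 declares only IllegalArgumentException, so I/O errors are wrapped.
            throw new IllegalArgumentException("truncated or unreadable input", e);
        }
    }

    // Convenience helper for in-memory encoding.
    static byte[] encode(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try {
            new StringBinaryCodec().serialize(String.class, s, buf);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }
}
```

An instance of such a codec would be passed as both the bs and bds arguments of mapBinaryType when registering the type.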

[0485] RPC Message Structure

[0486] In a preferred embodiment of the present invention, RPC messages, including requests and responses, are passed using XML, preferably using a variant of the Simple Object Access Protocol (SOAP). When an RPC message includes at least one parameter, return value, or property of binary type, and the binary data is larger than a certain configurable size, the RPC message is preferably encoded in MIME Multipart/Related Content-Type, with the binary data included as an attachment. The use of the MIME Multipart/Related standard separates the request/reply XML portion of the RPC message from the binary data portion, such as a file included in a response, in order to provide efficient transfer of binary data. Binary data of a smaller size is preferably base64 encoded. XML is preferably implemented using Content-Type: text/xml.

[0487] A preferred structure of an RPC message using MIME Multipart/Related is shown in Listing 6:

Listing 6

[0488] MIME-Version: 1.0
Content-Type: Multipart/Related; boundary=MIME_boundary; type=text/xml; start="rpc_message"

--MIME_boundary
Content-Type: text/xml; charset=UTF-8
Content-Transfer-Encoding: 8bit
Content-ID: rpc_message

<?xml version='1.0' ?>
<RPCEnvelope>
<RPCBody>
. . .
<binary href="part1"/>
. . .
</RPCBody>
</RPCEnvelope>

--MIME_boundary
Content-Type: byte[ ]
Content-Transfer-Encoding: binary
Content-Length: xxx
Content-ID: part1

. . .binary byte[ ] data
--MIME_boundary--
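The choice between inline base64 encoding and a MIME attachment may be sketched as follows. The class name BinaryPlacement and the inlineLimit parameter are hypothetical. The href form of the binary element follows Listing 6; the inline form, with the base64 text as element content, is an assumption, since the description above does not show it.

```java
import java.util.Base64;

// Hypothetical helper that decides how a binary value appears in the XML part.
final class BinaryPlacement {
    // Returns the XML fragment for one binary value. Small values are inlined
    // as base64; larger ones are referenced by Content-ID and would travel as
    // a separate MIME part (construction of that part is not shown here).
    static String encodeBinary(byte[] data, int inlineLimit, String contentId) {
        if (data.length <= inlineLimit) {
            return "<binary>" + Base64.getEncoder().encodeToString(data) + "</binary>";
        }
        return "<binary href=\"" + contentId + "\"/>";
    }
}
```

Base64 expands data by about one third, which is why inlining is reserved for small values in this sketch.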

[0489] As described above, RPC requests and RPC responses are preferably Java objects. Java classes implementing the following RPC request and RPC response interfaces are preferably used for RPC requests and RPC responses, respectively. A preferred Java interface of an RPC request is shown in Listing 7:

Listing 7

[0490] public void setLocalProperty(String optName, Object opt);

[0491] public Object getLocalProperty(String optName);

[0492] public Enumeration getLocalPropertyNames(String optNamePrefix);

[0493] public Object getRemoteProperty(String optName);

[0494] public void setRemoteProperty(String optName, Object opt);

[0495] public Enumeration getRemotePropertyNames(String optNamePrefix);

[0496] public void setMethodName(String name);

[0497] public String getMethodName( );

[0498] public void setMethodParameters(Object[] params) throws IllegalArgumentException;

[0499] public Object[] getMethodParameters( );

[0500] A preferred Java interface of an RPC response is shown in Listing 8:

Listing 8

[0501] public void setLocalProperty(String optName, Object opt);

[0502] public Object getLocalProperty(String optName);

[0503] public Enumeration getLocalPropertyNames(String optNamePrefix);

[0504] public Object getRemoteProperty(String optName);

[0505] public void setRemoteProperty(String optName, Object opt);

[0506] public Enumeration getRemotePropertyNames(String optNamePrefix);

[0507] public void setReturnValues(Object[] retvals) throws IllegalArgumentException;

[0508] public Object[] getReturnValues( ) throws RPCException;

[0509] public void setRPCException(RPCException rpcExp);

[0510] Preferably each RPC request message is assigned a unique identification number for control and debugging purposes. RPC responses include the identification number of the corresponding RPC request.

[0511] RPC Client

[0512] FIG. 13 is a block diagram that schematically illustrates further details of client application transport layer 170, in accordance with a preferred embodiment of the present invention. The client application transport layer (“RPC client”) is initialized as a system service that provides an RPC client context object to the VFN system. Receiver application layer 40 and adaptation layer 45 use the RPC client context in accessing their corresponding remote peer layers.

[0513] A preferred Java interface of the RPC client context is shown in Listing 9:

Listing 9

[0514] public RPCRequest getRPCRequest( );

[0515] public RPCResponse sendRPCRequest(RPCRequest req);

[0516] public void mapXMLType(String elementType, Class javaType, XMLSerializer xs, XMLDeserializer xds);

[0517] public void mapBinaryType(String elementType, Class javaType, BinarySerializer bs, BinaryDeserializer bds);

[0518] public String getRPCVersion( );

[0519] Adaptation layer 45 communicates with the RPC client through RPC client control layer 162, which comprises an RPC request factory 172, an RPC response factory 174, and an RPC protocol manager 176. The RPC request and response factories are used to hide the exact object creation and destruction details (for example, whether an object was reused from a pre-allocated pool or newly created) and the concrete implementation (so that the user of an object is aware only of the interface returned by the factory and not the concrete class implementation, which may be varied). RPC protocol manager 176 preferably handles network conditions (such as application failures, lost messages, out-of-order delivery, and method dependencies) in a generic manner. The RPC protocol manager includes, for example, a retransmission mechanism on the client side, and a response cache on the server side to aid in implementing at-most-once semantics for some requests.
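A server-side response cache of the kind mentioned above may be sketched as follows. This is a minimal illustration, assuming that each request carries a unique identifier that is repeated unchanged on retransmission; the class name ResponseCache, the bounded eviction policy, and the use of a function to represent the service handler are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical response cache: a retransmitted request (same ID) is answered
// from the cache instead of being executed a second time.
final class ResponseCache<Q, R> {
    private final Map<String, R> cache;

    ResponseCache(final int capacity) {
        // Insertion-ordered map that evicts the oldest entry once capacity is exceeded.
        this.cache = new LinkedHashMap<String, R>(16, 0.75f, false) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, R> eldest) {
                return size() > capacity;
            }
        };
    }

    synchronized R handle(String requestId, Q request, Function<Q, R> handler) {
        if (cache.containsKey(requestId)) {
            return cache.get(requestId); // recognized retransmission
        }
        R response = handler.apply(request); // executed at most once per ID
        cache.put(requestId, response);
        return response;
    }
}
```

Such a cache matters for non-idempotent calls such as file locking, where executing a retransmitted request twice could leave the two sides disagreeing about the result.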

[0520] The RPC client further comprises data encapsulation layer 164 and functional transport layer 166, as noted above, as well as an RPC management agent 178. RPC management agent 178 provides a management interface to the RPC component. This interface includes, for example, the host name and port number of each RPC server, the transport buffer sizes, and maximum and minimum number of connections to open with each endpoint. The RPC management agent is integrated with the component-wide management infrastructure of the entire VFN gateway. This architecture supports both blocking and non-blocking implementations of the application transport layer.

[0521] FIG. 14 is a flow chart that schematically illustrates a method for processing an RPC request by RPC client 170, in accordance with a preferred embodiment of the present invention. This method is invoked when the RPC client receives a request for RPC services from a higher protocol layer, at an RPC request step 200. The RPC client requests an empty RPC request object from the RPC context object, and sets the method name and parameters of the RPC request, at a parameter setting step 202. The RPC client sets local and remote properties, as described above, at a local property setting step 204 and remote property setting step 206, respectively.

[0522] The RPC client then encodes the RPC request using data encapsulation layer 164, as described above, at an encoding step 208. The RPC client sends the RPC request to the appropriate RPC server using functional transport layer 166, at a send RPC request step 210. The RPC client waits for an RPC response, at an RPC response wait step 212, until the RPC client receives the RPC response, at a receive RPC response step 214. The RPC client decodes the RPC response using data encapsulation layer 164, at a decoding step 216. The RPC client then returns the response to the requesting higher protocol layer, at an application response step 218.

[0523] Optionally, the operation of sending an RPC request and receiving the RPC response may be non-blocking. In such a case, the RPC client must guarantee that the parameters it passed to the RPC server will not be modified until the RPC request is actually sent. RPC client 170 is preferably also capable of controlling RPC sessions and invoking retransmits when required, as well as canceling (preempting) both blocking and non-blocking sessions when required.

[0524] RPC Server

[0525] FIG. 15 is a block diagram that schematically illustrates details of server application transport layer 168 (“RPC server”), in accordance with a preferred embodiment of the present invention. The RPC server is initialized as a system service which provides an RPC server context object for use by all RPC services in the VFN gateway. Alternatively, the RPC server may be deployed as a servlet or a URL handler, and is initiated as such. RPC services use the RPC server context for registration and for other functions, such as registering serializers and deserializers, security management, authentication, privatization, and authorization control. RPC services are provided by handlers. Preferably, the handlers run in the same process as the RPC server. Alternatively, handlers may run remotely and may be made available through the use of Java Remote Method Invocation (RMI) or application-specific protocols. Handlers preferably implement the RPCServerInterface Java interface as shown in Listing 10:

Listing 10

[0526] public void handleRPC(RPCRequest req, RPCResponse res);

[0527] RPC services are explicitly registered in an RPC services registry 182, identifying the specific services they provide. Each handler is preferably assigned a unique identifier for its service.

[0528] A preferred Java interface of the RPC server context is shown in Listing 11:

Listing 11

[0529] public void mapService(String prefix, RPCServiceHandler service);

[0530] public void sendRPCResponse(RPCResponse res);

[0531] public void mapXMLType(String elementType, Class javaType, XMLSerializer xs, XMLDeserializer xds);

[0532] public void mapBinaryType(String elementType, Class javaType, BinarySerializer bs, BinaryDeserializer bds);

[0533] public String getRPCVersion( );

[0534] RPC server 168 responds to RPC requests from RPC client 170. RPC server control layer 160 of the RPC server comprises an RPC service dispatcher 180, which dispatches RPC services pursuant to RPC requests received from RPC clients, as described below with reference to FIG. 16. RPC server control layer 160 further comprises an RPC protocol manager 176, as described above in connection with RPC client control layer 162. As noted above, the RPC server also comprises data encapsulation layer 164 and functional transport layer 166, as well as RPC management agent 178. This architecture supports both blocking and non-blocking implementations of the application transport layer.

[0535] FIG. 16 is a flow chart that schematically illustrates a method for processing an RPC request by RPC server 168, in accordance with a preferred embodiment of the present invention. The RPC server waits for RPC requests, preferably on open HTTP sockets, at an RPC request wait step 220, until an RPC request is received, at an RPC request receipt step 222. The RPC server decodes the RPC request using data encapsulation layer 164, at a decoding step 224. If an error occurs in decoding the RPC request, at an error checking step 242, the RPC server generates an empty RPC response, at an empty response step 244. The RPC server populates the RPC response with an error value or an empty response, at an error creation step 246, and proceeds to step 238 below.

[0536] On the other hand, as long as data is extracted successfully at step 224, the RPC server creates a service request object using the decoded data, at a service request object creation step 226. The RPC server finds the appropriate RPC service by looking up the received method name in RPC services registry 182, at a service lookup step 228. The RPC server generates an empty RPC response object for the outgoing response, at an empty RPC response generation step 230, and passes this empty object and the service request object to the appropriate RPC service handler, at a service dispatch step 232. When the request handler completes the requested service, the handler returns the request and response tuple to the RPC server. The request and response are passed by reference between all application layers in a VFN gateway, including between the request handler and the RPC server, thereby avoiding the overhead of copying data when crossing layer boundaries.

[0537] After receiving a response from the RPC service handler, the RPC server processes the RPC request and response, at a processing step 234. Based on the response from the RPC service, the RPC server sets the RPC return values for the response to be sent to RPC client 170, at a return value setting step 236. Using data encapsulation layer 164, the RPC server encapsulates the RPC response, at an encapsulation step 238, and sends the RPC response to the requesting RPC client, using functional transport layer 166, at a send response step 240. Preferably, only return values or a single exception, and remote service properties are returned from the RPC server. Preferably, method parameters are read-only, and the handler explicitly copies any modified objects to the return values set, thereby avoiding copying all parameters and saving heap space.
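The lookup and dispatch portion of the method of FIG. 16 (steps 228 through 236) may be sketched as follows. This is a minimal illustration and not the preferred implementation: RPCRequest and RPCResponse are reduced stand-ins for the interfaces of Listings 7 and 8, the handler interface follows the handleRPC signature of Listing 10, and the convention that a method name has the form "service.method", with the service prefix used as the registry key, is an assumption. Decoding, encapsulation, and transport are omitted.

```java
import java.util.HashMap;
import java.util.Map;

// Reduced stand-ins for the RPC request and response interfaces
// (only the members this sketch needs).
final class RPCRequest {
    final String methodName;
    final Object[] params; // treated as read-only by handlers

    RPCRequest(String methodName, Object... params) {
        this.methodName = methodName;
        this.params = params;
    }
}

final class RPCResponse {
    Object[] returnValues;
    String error; // stands in for the single exception mentioned above
}

interface RPCServiceHandler {
    void handleRPC(RPCRequest req, RPCResponse res);
}

// Hypothetical dispatcher with an in-memory services registry.
final class RPCServiceDispatcher {
    private final Map<String, RPCServiceHandler> registry = new HashMap<>();

    void mapService(String prefix, RPCServiceHandler service) {
        registry.put(prefix, service);
    }

    RPCResponse dispatch(RPCRequest req) {
        RPCResponse res = new RPCResponse();              // empty response (step 230)
        int dot = req.methodName.indexOf('.');
        String prefix = dot < 0 ? req.methodName : req.methodName.substring(0, dot);
        RPCServiceHandler handler = registry.get(prefix); // service lookup (step 228)
        if (handler == null) {
            res.error = "unknown service: " + prefix;
            return res;
        }
        try {
            handler.handleRPC(req, res);                  // dispatch (step 232); both objects
                                                          // are passed by reference, not copied
        } catch (RuntimeException e) {
            res.error = String.valueOf(e.getMessage());
        }
        return res;
    }
}
```

In this sketch the handler writes its results into the response object, mirroring the description above in which modified objects are copied explicitly into the return values rather than returned through the parameters.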

[0538] Functional Transport Layer

[0539] The choice of which underlying transport protocol to use in functional transport layer 166 is driven by network constraints, particularly firewall policies. TCP may be preferable from an engineering and performance point of view because it is natively bidirectional and generally incurs less overhead than HTTP. However, in many cases it is preferable to use HTTP because of its ability to pass through most firewalls without requiring custom network configuration and security policy decisions. Preferably, functional transport layer 166 provides built-in resumption of failed connections. When HTTP is used as the underlying transport protocol, layer 166 typically uses standard HTTP proxies, and is proxy-aware in order to disable any caching of inter-VFN communications that standard HTTP proxies may attempt to automatically implement. Alternatively or additionally, the functional transport layer may be based on SOCKS gateways, as are known in the art. Preferably, layer 166 also produces metrics that can be used by a monitoring tool, such as PerfMon.

[0540] Functional transport layer 166 preferably uses connection pooling, which allows multiple connection objects to be pooled and shared transparently among requesting clients. By reusing open connections, the cost of connection establishment is amortized, particularly for short messages, such as control messages. A connection may be kept open longer than absolutely required in the expectation that another request will be sent over it. Connection pooling also aggregates and multiplexes physical connections (the sockets) in logical sessions between the VFN receiver and VFN transmitter. When using pooling, layer 166 attempts to avoid permanent bias towards certain destinations, to avoid starvation of some destinations, and to provide fairness of service (i.e., proportional to traffic levels).

[0541] Communication by layer 166 is preferably synchronized: an RPC client sends an RPC request to an RPC server and then waits for an RPC response to the specific RPC request. An RPC response is thus always associated with an RPC request. This approach represents a blocking model. Preferably, the underlying HTTP sockets are persistent (i.e., they are reused for several transactions), by making proper use of the HTTP Content-Length field. The following parameters are set for each VFN receiver-VFN transmitter pair: minimum number of idle connections, maximum number of idle connections, and maximum number of connections.
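
As a minimal sketch of how the three pool parameters named above might interact, the following Python class keeps a bounded pool of connection objects for one VFN receiver-VFN transmitter pair. The class name ConnectionPool, the factory callable, and the policy of opening the minimum number of idle connections at start-up are assumptions made for illustration. The fairness across destinations described above is not modeled.

```python
import threading
from collections import deque


class ConnectionPool:
    """Hypothetical pool for a single receiver-transmitter pair."""

    def __init__(self, factory, min_idle=1, max_idle=4, max_total=8):
        if not 0 <= min_idle <= max_idle <= max_total:
            raise ValueError("require 0 <= min_idle <= max_idle <= max_total")
        self._factory = factory          # callable that opens a new connection
        self._max_idle = max_idle
        self._max_total = max_total
        self._idle = deque()
        self._total = 0                  # idle + in-use connections
        self._cond = threading.Condition()
        # Assumption: pre-open the minimum number of idle connections.
        for _ in range(min_idle):
            self._idle.append(factory())
            self._total += 1

    def acquire(self, timeout=None):
        """Return an open connection, reusing an idle one when possible."""
        with self._cond:
            while True:
                if self._idle:
                    return self._idle.popleft()
                if self._total < self._max_total:
                    self._total += 1     # reserve a slot, open outside the lock
                    break
                if not self._cond.wait(timeout):
                    raise TimeoutError("no connection available")
        try:
            return self._factory()
        except Exception:
            with self._cond:
                self._total -= 1         # give the reserved slot back
                self._cond.notify()
            raise

    def release(self, conn):
        """Return a connection; keep it open unless the idle limit is reached."""
        with self._cond:
            if len(self._idle) < self._max_idle:
                self._idle.append(conn)
            else:
                self._total -= 1
                close = getattr(conn, "close", None)
                if close is not None:
                    close()
            self._cond.notify()
```

In this sketch a released connection stays open for reuse, which corresponds to keeping a connection open longer than strictly required in the expectation of further requests.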

[0542] Alternatively, the underlying sockets may not be persistent, such as when using HTTP 1.0, which does not support persistent sockets. RPC communication in this case uses the RPC client thread context. Preemptive priorities are preferably provided for communication scheduling, in order to handle priority inversions. Priority inversions may occur when transmission of a low-priority message is initiated during a period when no high-priority messages are pending, and a high-priority message is subsequently generated prior to completion of the low-priority transfer. When such an inversion occurs, layer 166 preferably preempts the ongoing lower-priority communications in order to promptly initiate the higher-priority communication task.

[0543] Further alternatively, layer 166 may pipe RPC messages without maintaining message order, using a pool of threads to send RPC requests over a pool of open HTTP connections. Another pool of threads reads RPC responses from the same pool of connections. This piped approach requires pipelined HTTP support, which is an HTTP 1.1 feature. It enables implementation of a non-blocking model. In such an approach, the RPC client preferably comprises the following components (not shown in the figures):

[0544] Requests queue, which contains outgoing RPC requests to be sent in some order, which is not necessarily first-in-first-out. Message priorities are defined and a fair queuing algorithm is used to prevent starvation. The queue length may be restricted in order to set a limit on resources that can be used.

[0545] Writers, which are one or more threads that extract RPC requests from the queue and send them over one or more HTTP connections.

[0546] Readers, which are one or more threads that receive RPC responses from one or more HTTP connections. Each response is returned to the appropriate RPC request issuer. The RPC responses may return out-of-order, that is, in a different order from that in which their corresponding RPC requests were sent.

[0547] The issuer of an RPC request may block until the RPC response arrives, or it may be non-blocking, in which case it is notified when the RPC response has been received. In both cases, the parameters provided by application layer 40 are preferably not modified until the RPC request has been sent.
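
The cooperation between the requests queue, writers, and readers can be sketched as shown below. This is a simplified illustration under stated assumptions: the class RpcClientQueue and its method names are hypothetical, and the queue uses strict priorities with first-in-first-out ordering among equal priorities. It therefore does not implement the fair queuing mentioned above and could starve low-priority requests under sustained high-priority load.

```python
import heapq
import itertools
import threading


class RpcClientQueue:
    """Hypothetical requests queue with response correlation by message ID."""

    def __init__(self, max_len=100):
        self._heap = []                      # (-priority, seq, msg_id, payload)
        self._seq = itertools.count()        # tie-breaker: FIFO within a priority
        self._next_id = itertools.count(1)   # unique message identifiers
        self._pending = {}                   # msg_id -> waiter record
        self._lock = threading.Lock()
        self._max_len = max_len              # bounds the resources used

    def submit(self, payload, priority=0):
        """Called by the issuer; returns the message ID of the request."""
        with self._lock:
            if len(self._heap) >= self._max_len:
                raise RuntimeError("request queue is full")
            msg_id = next(self._next_id)
            self._pending[msg_id] = {"event": threading.Event(), "response": None}
            heapq.heappush(self._heap,
                           (-priority, next(self._seq), msg_id, payload))
            return msg_id

    def next_request(self):
        """Called by a writer thread; returns (msg_id, payload) or None."""
        with self._lock:
            if not self._heap:
                return None
            _, _, msg_id, payload = heapq.heappop(self._heap)
            return msg_id, payload

    def deliver(self, msg_id, response):
        """Called by a reader thread; responses may arrive in any order."""
        with self._lock:
            waiter = self._pending.get(msg_id)
        if waiter is None:
            return False                     # unknown or already-collected ID
        waiter["response"] = response
        waiter["event"].set()
        return True

    def wait(self, msg_id, timeout=None):
        """Blocking issuer: wait for the response to one specific request."""
        with self._lock:
            waiter = self._pending[msg_id]
        if not waiter["event"].wait(timeout):
            raise TimeoutError("no response for message {}".format(msg_id))
        with self._lock:
            self._pending.pop(msg_id, None)
        return waiter["response"]
```

A non-blocking issuer could replace the call to wait with a callback stored in the waiter record. That variant is omitted here to keep the sketch short.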

[0548] Further alternatively, RPC messages may be aggregated and sent asynchronously. With this approach, several RPC requests and/or RPC responses are aggregated into a single HTTP message. The number of RPC messages included in the same HTTP message can vary.

[0549] Unique identifiers must be provided for messages, as described above, because RPC messages often arrive out of order. This approach allows delayed and disconnected operation of application transport layer 46. Both this aggregated approach and the piped approach described above provide more efficient utilization of the HTTP connections, thus reducing the waiting time of clients for responses.

[0550] RPC messages over HTTP are preferably HTTP-compliant, particularly the Request-Line field, the Status-Line field, and the standard HTTP headers. In addition, the following RPC-related HTTP headers are used:

[0551] RPC-Version, for the version of the RPC protocol

[0552] RPC-Msg-ID, which is an identification number associated with each HTTP RPC message, allowing, for example, correlation between requests and responses or managing RPC semi-reliable message delivery. (This header is not relevant in the aggregated approach described above.) Alternatively, the identifier is implemented as an internal RPC data field, rather than as an HTTP header.

[0553] The following general HTTP headers are also used:

[0554] Hostname

[0555] Content-Type: either text/xml or multipart/related

[0556] Content-Length (as described above)
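
For illustration, an RPC request carried over HTTP with the headers listed above might be assembled as in the following sketch. Several details are assumptions rather than part of the described embodiments: the POST method, the example path and version string, and the function name build_rpc_http_request. The sketch uses the standard HTTP/1.1 header name Host for the host name header.

```python
def build_rpc_http_request(host, path, body, msg_id, rpc_version="1.0"):
    """Serialize one RPC request as an HTTP/1.1 message (illustrative only).

    body must be bytes; Content-Length is its exact size, which is what
    allows the receiver to find the message boundary on a persistent socket.
    """
    headers = [
        ("Host", host),
        ("Content-Type", "text/xml"),
        ("Content-Length", str(len(body))),
        ("RPC-Version", rpc_version),   # version of the RPC protocol
        ("RPC-Msg-ID", str(msg_id)),    # correlates request and response
    ]
    lines = ["POST {} HTTP/1.1".format(path)]  # Request-Line (POST is assumed)
    lines.extend("{}: {}".format(name, value) for name, value in headers)
    head = "\r\n".join(lines) + "\r\n\r\n"     # blank line ends the header block
    return head.encode("ascii") + body
```

Because the header block ends with an empty line and Content-Length gives the body size exactly, several such messages can be written back to back on one persistent connection.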

[0557] When possible, functional transport layer 166 uses data compression. For example, the Transfer-Encoding HTTP header may be used for compressing the entire HTTP message content.

[0558] Error Detection and Handling

[0559] Several types of errors may occur in application transport layer 46:

[0560] Transport errors, such as connection refused, HTTP protocol errors (incorrect headers, misuse of HTTP, wrong URL path, etc.) and socket timeouts.

[0561] Internal (local) errors, such as wrong object types (no serializer/deserializer found), and no available service for a specific method.

[0562] RPC protocol errors, such as incorrect RPC version and incorrect message structure.

[0563] Preferably, the application transport layer shields the higher protocol layers from these errors. Optionally, application layers 40 and 42 are notified of the occurrence of some or all of these errors, using a meaningful set of error codes. Upon notification, the application layers preferably log or handle the errors. For example, in certain cases, the application layer may set a “disconnection” flag for a specific RPC server. The application transport layer is preferably fail-safe: RPC clients and RPC servers assume that the other may crash and are able to recover from such crashes. When necessary, application layers 40 and 42 can cancel ongoing or waiting requests.
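
One possible way of reporting the three error types to the application layers is sketched below. The enumeration ErrorKind, the exception classes, and the rule that a transport error sets the "disconnection" flag are all assumptions chosen for illustration. The description above does not specify the error codes or the conditions under which the flag is set.

```python
import enum


class ErrorKind(enum.Enum):
    TRANSPORT = "transport"        # connection refused, socket timeouts
    INTERNAL = "internal"          # wrong object type, no service for method
    PROTOCOL = "rpc-protocol"      # wrong RPC version or message structure


class RpcProtocolError(Exception):
    """Hypothetical error raised for a bad RPC version or structure."""


class NoServiceError(Exception):
    """Hypothetical error raised when no service handles a method."""


def classify(exc):
    """Map an exception to one of the three error types."""
    if isinstance(exc, RpcProtocolError):
        return ErrorKind.PROTOCOL
    if isinstance(exc, (NoServiceError, TypeError)):
        return ErrorKind.INTERNAL
    if isinstance(exc, OSError):   # covers ConnectionRefusedError, TimeoutError
        return ErrorKind.TRANSPORT
    return ErrorKind.INTERNAL      # assumption: unknown errors are local


class ServerState:
    """Per-RPC-server state kept by the application layer (hypothetical)."""

    def __init__(self):
        self.disconnected = False

    def report(self, exc):
        kind = classify(exc)
        if kind is ErrorKind.TRANSPORT:
            self.disconnected = True   # assumed policy for the flag
        return kind
```

The application layer would call report for each notified error and could consult the disconnected attribute before issuing further requests to that server.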

Redirection Control

[0564] The VFN system provides means for redirecting requests from clients 28 to their local VFN receiver 48. Redirection is described below for HTTP, NFS, and SMB resources. Methods of redirection for other resources will be evident to those skilled in the art.

[0565] HTTP

[0566] The VFN receiver is configured to function as an HTTP proxy for HTTP client requests to the VFN transmitter, by using the proxy auto configuration (PAC) mechanism. This mechanism is supported by both Netscape® and Microsoft Internet Explorer browsers. Manual configuration may also be used, but it does not allow selective proxying. Alternatively, DNS-based redirection may be used, in which case the local DNS server forwards requests (using the zone forwarding feature) to the VFN DNS. Further alternatively, WCCPv2-like redirection of specific IP addresses and ports is supported.

[0567] NFS

[0568] The VFN system uses the standard NFS mount protocol. NFS client hosts mount the VFN receiver that resides on the local LAN, wherein the name of the mounted file system may be identical to the remote path. The local VFN receiver subsequently handles access to remote files.

[0569] SMB

[0570] The standard “mount” facility for SMB is used, by mapping a network drive to a directory on the VFN receiver that resides in the same LAN.

[0571] The VFN request redirection preferably provides automatic fail-over to the origin server if a VFN receiver or VFN transmitter fails.

[0572] Although some features of preferred embodiments are described herein as being implemented on both a VFN transmitter and a VFN receiver, these features may similarly be applied to different combinations of clients, origin servers, VFN transmitters, and VFN receivers. For example, features may be implemented on a file system client and file server, without a VFN transmitter or VFN receiver. Additionally, features may be implemented on a client and VFN transmitter that communicate with one another, without a VFN receiver, or on a VFN receiver and server that communicate with one another, without a VFN transmitter.

[0573] Moreover, although preferred embodiments of the present invention have been described with respect to interception of network file system protocol requests, some aspects of the present invention can be implemented using file system drivers accessible by local network clients.

[0574] Furthermore, although preferred embodiments are described herein with reference to certain communication protocols, programming languages and file systems, the principles of the present invention may similarly be applied using other protocols, languages and file systems. It will thus be appreciated by persons skilled in the art that the present invention is not limited to what has been particularly shown and described hereinabove. Rather, the scope of the present invention includes both combinations and subcombinations of the various features described hereinabove, as well as variations and modifications thereof that are not in the prior art, which would occur to persons skilled in the art upon reading the foregoing description.

1-76. (Canceled)
 77. A method for enabling access to a data resource, which is held on a file server on a first local area network (LAN), by a client on a second LAN, the method comprising: conveying a replica of the data resource over a wide area network (WAN) from the file server to a cache held by a proxy receiver on the second LAN; intercepting at the proxy receiver a file system request for the data resource submitted by the client over the second LAN; checking the cache to determine whether the replica of the data resource is present in the cache and valid; and responsive to the file system request and to determining that the replica is present and valid, serving the replica of the data resource from the cache of the proxy receiver to the client over the second LAN.
 78. A method according to claim 77, wherein the data resource comprises at least one of a file, a block of a file, a page of content encoded in a markup language, and a file system directory.
 79-81. (Canceled)
 82. A method according to claim 77, wherein conveying the replica of the data resource comprises conveying at least one of metadata, an access list applicable to the data resource, and a permission applicable to the data resource.
 83-84. (Canceled)
 85. A method according to claim 77, whereinthe request for the data resource is submitted by the client using acall to a native network file system used by the file server. 86.(Canceled)
 87. A method according to claim 77, wherein conveying thereplica comprises monitoring the file server using a watchdog agent todetect a change made to the data resource by a native client on thefirst LAN, and conveying the replica of the data resource from the fileserver to the proxy receiver again responsive to the change.
 88. Amethod according to claim 77, wherein the data resource is a filecomprising a plurality of file blocks, and wherein conveying the replicacomprises analyzing a pattern of access by the client to the fileblocks, and conveying replicas of a portion of the file blocks not yetrequested by the client, responsive to the pattern.
 89. (Canceled)
 90. Amethod according to claim 77, wherein serving the replica comprisesperiodically checking at the proxy receiver whether the replica of thedata resource in the cache of the proxy receiver is consistent with thedata resource held by the file server, and deleting the replica from thecache upon determining that the replica is not consistent.
 91. A methodaccording to claim 77, and comprising deleting the replica from thecache responsive to a predetermined cache removal policy.
 92. A methodaccording to claim 77, and comprising conveying to the proxy receivermetadata regarding the data resource on the file server and, responsiveto the metadata, presenting to the client a virtual directory of thefile server.
 93. A method according to claim 77, wherein interceptingthe request comprises intercepting a lock request submitted by theclient for a lock on the data resource, and wherein conveying thereplica over the WAN comprises transmitting a lock message via the WANfrom the proxy receiver to the file server, requesting the lock, andcomprising: responsive to the lock message, issuing the lock at the fileserver; conveying the lock over the WAN from the file server to theproxy receiver; and serving the lock from the proxy receiver to theclient.
 94. A method according to claim 77, wherein conveying thereplica of the data resource from the file server to the cache held bythe proxy receiver comprises determining whether the data resource isheld by the file server, and conveying a negative response relating tothe data resource from the file server to the proxy receiver when it isdetermined that the data resource is not held by the file server, andcomprising caching the negative response at the proxy receiver for acertain period.
 95. A method according to claim 94, wherein serving thereplica of the data resource from the cache of the proxy receiver to theclient comprises checking whether the negative response relating to therequested data resource is present and not expired, and, responsive todetermining that the negative response is present and not expired,serving the negative response from the proxy receiver to the client overthe second LAN.
 96. A method according to claim 77, wherein interceptingthe request comprises intercepting a file system request submitted bythe client for an operation on the data resource, and whereintransmitting the message comprises transmitting the file system requestand a request for a lock via the WAN from the proxy receiver to the fileserver, and comprising, responsive to the request for the lock,obtaining the lock from the file server at the proxy receiver.
 97. A method according to claim 96, and comprising, if the proxy receiver intercepts no more file system requests from the client with respect to the data resource for a certain period, issuing an unlock request from the proxy receiver to the file server with respect to the data resource.
 98. A method according to claim 77, wherein intercepting the request comprises intercepting the request for the data resource submitted in accordance with a first native network file system of the client, and wherein conveying the replica comprises: translating the request for the data resource from the first native network file system to a second native network file system used by the file server, requesting the resource from the file server using the translated request, and conveying the replica of the data resource to the proxy receiver over the WAN.
 99. A method according to claim 77, wherein conveying the replicaof the data resource over the WAN comprises ascertaining an availablebandwidth of the WAN, and conveying the replica using a portion of thebandwidth that is less than a total available bandwidth, responsive to amanagement directive downloaded to the proxy receiver over the WAN. 100.A method according to claim 77, and comprising, upon determining thatthe replica is not present or not valid, requesting that the replica beconveyed again from the file server to the proxy receiver. 101-104.(Canceled)
 105. A method according to claim 77, and comprisingperforming an operation on the replica of the data resource in the cacheresponsive to a management directive downloaded to the proxy receiverover the WAN.
 106. (Canceled)
 107. A method according to claim 77,wherein intercepting the request comprises intercepting a group of oneor more requests for first data resources on the file server, andcomprising analyzing a pattern of the group of requests, and retrievingreplicas of one or more second data resources from the file server tothe cache of the proxy receiver, responsive to the pattern.
 108. Amethod according to claim 107, wherein retrieving the replicas of theone or more second data resources comprises retrieving the second dataresources before the client requests the second data resources.
 109. Amethod according to claim 107, wherein analyzing the pattern comprisescalculating for each of the second data resources on the file server arelation of an expected usage of the replicas of the second dataresources at the proxy receiver to an expected modification rate of thesecond data resources at the file server.
 110. A method according toclaim 107, wherein retrieving the replicas of the one or more seconddata resources comprises analyzing a relation of an available bandwidthof the WAN to an expected usage of the replicas of the second dataresources at the proxy receiver, and determining, responsive to therelation, when to retrieve a replica of the second data resource. 111.(Canceled)
 112. A method according to claim 107, wherein retrievingreplicas of the one or more second data resources comprises determiningan order of retrieval of the second data resources responsive to apredetermined retrieval policy, and conveying the replicas over the WANin the determined order.
 113. A method according to claim 112, whereinin accordance with the retrieval policy, the first data resourcesrequested by the client are retrieved with a higher priority than thesecond data resources.
 114. A method according to claim 77, andcomprising intercepting at the proxy receiver a write request submittedby the client for application to the data resource, and passing thewrite request over the WAN from the proxy receiver to the file server.115-116. (Canceled)
 117. A method for enabling access to data resourcesheld on a file server on a first local area network (LAN) by a client ona second LAN, the method comprising: reading metadata from the fileserver using a proxy transmitter on the first LAN; transmitting themetadata via a wide area network (WAN) from the proxy transmitter to aproxy receiver on the second LAN; and based on the metadata,constructing at the proxy receiver a directory of the data resources onthe file server, for use by the client in accessing the data resources.118. (Canceled)
 119. A method according to claim 117, wherein themetadata includes file attributes of the data resources, which fileattributes are stored in a directory object on the file server, andwherein reading the metadata comprises reading the file attributes fromthe directory object.
 120. A method according to claim 117, wherein thedata resources comprise files, and wherein the metadata includes fileattributes that are stored in the files, and wherein reading themetadata comprises reading the file attributes from the files.
 121. Amethod according to claim 117, and comprising intercepting at the proxyreceiver a file system request with respect to one of the data resourcesin the directory submitted by the client over the second LAN, and,responsive to the file system request, serving data from the one of thedata resources from the proxy receiver to the client over the secondLAN. 122-200. (Canceled)
 201. Apparatus for enabling access to a dataresource, which is held on a file server on a first local area network(LAN), by a client on a second LAN, the apparatus comprising a proxyreceiver, which is located on the second LAN and comprises a cache, andwhich is adapted to retrieve a replica of the data resource from thefile server over a wide area network (WAN) to the cache, to intercept afile system request for the data resource submitted by the client overthe second LAN, to check the cache to determine whether the replica ofthe data resource is present in the cache and valid, and, responsive tothe file system request and to determining that the replica is presentand valid, to serve the replica of the data resource from the cache tothe client over the second LAN.
 202. Apparatus according to claim 201,wherein the data resource comprises at least one of a file, a block of afile, a page of content encoded in a markup language, and a directory.203-205. (Canceled)
 206. Apparatus according to claim 201, wherein theproxy receiver is adapted to retrieve from the file server at least oneof metadata, an access list applicable to the data resource, and apermission applicable to the data resource. 207-208. (Canceled) 209.Apparatus according to claim 201, wherein the request for the dataresource is submitted by the client using a call to a native networkfile system used by the file server.
 210. (Canceled)
 211. Apparatusaccording to claim 201, and comprising a watchdog agent, which isadapted to monitor the file server to detect a change made to the dataresource by a native client on the first LAN, wherein the proxy receiveris adapted to retrieve the replica of the data resource again from thefile server responsive to the change.
 212. Apparatus according to claim201, wherein the data resource is a file comprising a plurality of fileblocks, and wherein the proxy receiver is adapted to analyze a patternof access by the client to the file blocks, and to retrieve from thefile server replicas of a portion of the file blocks not yet requestedby the client, responsive to the pattern.
 213. (Canceled)
 214. Apparatusaccording to claim 201, wherein the proxy receiver is adapted toperiodically check whether the replica of the data resource in the cacheis consistent with the data resource held by the file server, and todelete the replica from the cache upon determining that the replica isnot consistent.
 215. Apparatus according to claim 201, wherein the proxyreceiver is adapted to delete the replica from the cache responsive to apredetermined cache removal policy.
 216. Apparatus according to claim201, wherein the proxy receiver is adapted to retrieve from the fileserver metadata regarding the data resource on the file server, and topresent to the client a virtual directory of the file server, responsiveto the metadata.
 217. Apparatus according to claim 201, wherein theproxy receiver is adapted to intercept a request submitted by the clientfor a lock on the data resource, to transmit a lock message via the WANto the file server, requesting the lock, to receive over the WAN a lockissued by the file server, and to serve the lock to the client. 218.Apparatus according to claim 201, wherein the proxy receiver is adaptedto determine whether the data resource is held by the file server, andto cache a negative response relating to the data resource for a certainperiod, when it is determined that the data resource is not held by thefile server.
 219. Apparatus according to claim 218, wherein the proxyreceiver is adapted to check whether the negative response relating tothe requested data resource is present and not expired, and, responsiveto determining that the negative response is present and not expired, toserve the negative response to the client over the second LAN. 220.Apparatus according to claim 201, wherein the proxy receiver is adaptedto intercept a file system request submitted by the client for anoperation on the data resource, and to send the file system request anda request for a lock via the WAN to the file server, and wherein theproxy receiver is adapted to obtain the lock from the file server,responsive to the request for the lock.
 221. Apparatus according toclaim 220, wherein the proxy receiver is adapted to issue an unlockrequest to the file server with respect to the data resource, if theproxy receiver intercepts no more file system requests from the clientwith respect to the data resource for a certain period.
 222. Apparatus according to claim 201, wherein the proxy receiver is adapted to intercept the request for the data resource submitted in accordance with a first native network file system of the client, to translate the request for the data resource from the first native network file system to a second native network file system used by the file server, to request the resource from the file server using the translated request, and to retrieve from the file server the replica of the data resource over the WAN.
 223. Apparatus according to claim 201, wherein the proxy receiver is adapted to ascertain an available bandwidth of the WAN, and to retrieve from the file server the replica using a portion of the bandwidth that is less than a total available bandwidth, responsive to a management directive downloaded to the proxy receiver over the WAN.
 224. Apparatus according to claim 201, wherein the proxy receiver is adapted to request that the replica be conveyed again from the file server to the proxy receiver, upon determining that the replica is not present or not valid.
 225-228. (Canceled)
 229. Apparatus according to claim 201,wherein the proxy receiver is adapted to perform an operation on thereplica of the data resource in the cache responsive to a managementdirective downloaded to the proxy receiver over the WAN.
 230. (Canceled)231. Apparatus according to claim 201, wherein the proxy receiver isadapted to intercept a group of one or more requests for first dataresources on the file server, to analyze a pattern of the group ofrequests, and to retrieve replicas of one or more second data resourcesfrom the file server to the cache, responsive to the pattern. 232.Apparatus according to claim 231, wherein the proxy receiver is adaptedto retrieve the replicas of the one or more second data resources beforethe client requests the second data resources.
 233. Apparatus accordingto claim 231, wherein the proxy receiver is adapted to calculate foreach of the second data resources on the file server a relation of anexpected usage of the replicas of the second data resources at the proxyreceiver to an expected modification rate of the second data resourcesat the file server, and to retrieve the replicas from the file server tothe cache, responsive to the calculation.
 234. Apparatus according toclaim 231, wherein the proxy receiver is adapted to analyze a relationof an available bandwidth of the WAN to an expected usage of thereplicas of the second data resources at the proxy receiver, and todetermine, responsive to the relation, when to retrieve a replica of thesecond data resource.
 235. (Canceled)
 236. Apparatus according to claim 231, wherein the proxy receiver is adapted to determine an order of retrieval of the second data resources responsive to a predetermined retrieval policy, and to retrieve the replicas from the file server over the WAN in the determined order.
 237. Apparatus according to claim236, wherein the proxy receiver is adapted to retrieve the first dataresources requested by the client with a higher priority than the seconddata resources, in accordance with the retrieval policy.
 238. Apparatusaccording to claim 201, wherein the proxy receiver is adapted tointercept a write request submitted by the client for application to thedata resource, and to pass the write request over the WAN to the fileserver. 239-240. (Canceled)
 241. Apparatus for enabling access to dataresources held on a file server on a first local area network (LAN) by aclient on a second LAN, the apparatus comprising: a proxy transmitter,located on the first LAN and adapted to read metadata from the fileserver, to transmit the metadata via a wide area network (WAN) to thesecond LAN; and a proxy receiver, located on the second LAN, which isadapted to construct a directory, based on the metadata, of the dataresources on the file server, for use by the client in accessing thedata resources.
 242. (Canceled)
 243. Apparatus according to claim 241,wherein the metadata includes file attributes of the data resources,which file attributes are stored in a directory object on the fileserver, and wherein the proxy transmitter is adapted to read the fileattributes from the directory object.
 244. Apparatus according to claim241, wherein the data resources comprise files, and wherein the metadataincludes file attributes that are stored in the files, and wherein theproxy transmitter is adapted to read the file attributes from the files.245. Apparatus according to claim 241, wherein the proxy receiver isadapted to intercept a file system request with respect to one of thedata resources in the directory submitted by the client over the secondLAN, and, responsive to the file system request, to serve data from theone of the data resources to the client over the second LAN. 246-324.(Canceled)
 325. A computer software product for enabling access to adata resource, which is held on a file server on a first local areanetwork (LAN), by a client on a second LAN, the product comprising acomputer-readable medium, in which program instructions are stored,which instructions, when read by a computer on the second LAN, cause thecomputer to operate as a proxy receiver having a cache, so as toretrieve a replica of the data resource from the file server over a widearea network (WAN) to the cache, to intercept a file system request forthe data resource submitted by the client over the second LAN, to checkthe cache to determine whether the replica of the data resource ispresent in the cache and valid, and, responsive to the file systemrequest and to determining that the replica is present and valid, toserve the replica of the data resource from the cache to the client overthe second LAN. 326-364. (Canceled)
 365. A computer software product forenabling access to data resources held on a file server on a first localarea network (LAN) by a client on a second LAN, the product comprising acomputer-readable medium, in which program instructions are stored,which instructions, when read by a first computer on the first LAN,cause the first computer to operate as a proxy transmitter, so as toread metadata from the file server, and to transmit the metadata via awide area network (WAN) to the second LAN, and which instructions, whenread by a second computer on the second LAN, cause the second computerto operate as a proxy receiver, and to construct a directory, based onthe metadata, of the data resources on the file server, for use by theclient in accessing the data resources. 366-372. (Canceled)